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ABSTRACT 

Lyman-alpha (Lya) forest absorption spectra towards quasars at z ~ 6 show regions of enhanced 
transmission close to their source. Several authors have argued that the apparently small sizes of 
these regions indicate that quasar ionization fronts at z > 6 expand into a largely or partly neutral 
intergalactic medium (IGM). Assuming that the typical region in the IGM is reionized by z < 6, as 
is suggested by Lya forest observations, we argue that at least 50% of the volume of the IGM was 
reionized before the highest redshift quasars turned on. Further, even if the IGM is as much as ~ 50% 
neutral at quasar turn-on, the quasars are likely born into large galaxy-generated HII regions. The 
HII regions during reionization are themselves clustered, and using radiative transfer simulations, we 
find that long skewers through the IGM towards quasar progenitor halos pass entirely through ionized 
bubbles, even when the IGM is half neutral. These effects have been neglected in most previous 
analyses of quasar proximity zones, which assumed a spatially uniform neutral fraction. We model 
the subsequent ionization from a quasar, and construct mock Lya forest spectra. Our mock absorption 
spectra are more sensitive to the level of small-scale structure in the IGM than to the volume-averaged 
neutral fraction, and suggest that existing proximity-zone size measurements are compatible with a 
fully ionized IGM. However, we mention several improvements in our modeling that are necessary to 
make more definitive conclusions. 

Subject headings: cosmology: theory - reionization - intergalactic medium - large scale structure of 
universe 



1. INTRODUCTION 

The Epoch of Reionization (EoR), when HII regions 
grow around galaxies and/or quasars, eventually overlap 
and fill the entire IGM, is fundamental to our under- 
standing of cosmological structure formation. Detailed 
observations of the EoR will characterize the nature of 
the first luminous sources in the Universe, describe their 
impact on the surrounding IGM, and fill in a significant 
gap in our knowledge of the history of the Universe. 
Prospects for observational advances are bright: 21cm 
observations (e.g. Madau et al. 1997, Zaldarriaga et al. 

2004, for a review see Furlanetto et al. 2006a), improved 
measurements of polarization of the cosmic microwave 
background (CMB) (Zaldarriaga 1997, Kaplinghat et al. 
2003), small-scale CMB fluctuations (e.g. Zahn et al. 

2005, McQuinn et al. 2005), increasingly deep narrow- 
band Ly-a surveys (e.g. Haiman & Spaans 1999, Barton 
et al. 2004, Furlanetto 2006b, McQuinn et al. in prep.), 
optical afterglow spectra of gamma ray bursts (GRBs) 
(Barkana & Loeb 2004), quasar absorption spectra (Fan 
et al. 2006), and other probes, promise a wealth of new 
data in the near future. 

What have we learned from existing observations? 
This is, of course, an intrinsically interesting question, 
but it is also an important one for directing the design 
of future surveys and experiments. For example, cur- 
rent constraints can provide important guidance regard- 
ing the optimal target redshift range for future reioniza- 
tion surveys. 

Inferences about the duration of the EoR come from 
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measurements of anisotropies in the polarization of the 
CMB (Page et al. 2006), narrow-band surveys for Ly- 
a emitters (e.g. Malhotra & Rhoads 2005), the optical 
afterglow of a z = 6.3 GRB (Totani et al. 2006), and par- 
ticularly the absorption spectra of high redshift quasars 
(e.g. Fan et al. 2006). While valuable and exciting, 
these observations have yielded constraints on the EoR 
that are generally weak and subtle to interpret. The cen- 
tral value for the electron scattering optical depth from 
WMAP favors reionization activity at z > 11, but, at the 
1 — 0" level, current limits are consistent with rapid reion- 
ization ending near z ~ 6 (Page et al. 2006). Indeed, 
at ~ 3 — a the data are consistent with no reionization 
whatsoever (Page et al. 2006). 

Malhotra & Rhoads (2005) have constrained the ion- 
ized volume fraction in the IGM by counting the abun- 
dance of Ly-a emitters at z — 6.5, and requiring a mini- 
mum ionized volume around the sources to avoid atten- 
uating their Ly-a photons. In this manner, Malhotra & 
Rhoads quote an upper limit on the neutral volume frac- 
tion of the IGM, Ahi < 0.5. In reality, the sources likely 
reside in much larger ionized regions (e.g. Furlanetto et 
al. 2006b), and this constraint should be tightened with 
more detailed modeling. For example, Dijkstra at al. 
(2006) find that these observations imply Ahi < 0.2. 

The optical afterglow spectra of z > 6 GRBs can, in 
principle, be used to search for damping-wing absorption 
redward of Ly-a, a signature of a largely neutral IGM 
(Miralda-Escude 1998, Barkana & Loeb 2004). However, 
most GRB optical afterglows show evidence for damped 
Lyman-alpha absorbers associated with their host galax- 
ies, which are problematic to distinguish from a largely 
neutral IGM (Totani et al. 2006). Even so, Totani et 
al. (2006) use the Ly-/3 region of a z = 6.3 afterglow to 
argue against a largely neutral IGM, giving a constraint 
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of X m < 0.6. 

Presently, the most detailed information on the state 
of the IGM at z ~ 6 comes from Ly-a forest absorption 
towards high redshift quasars (e.g. Fan et al. 2006). It 
is difficult to derive constraints on the ionization state of 
the IGM from the spectra of these quasars, owing to the 
large absorption cross section for Ly-a photons. Indeed, 
a highly ionized IGM (Xhi > 10~ 4 ) results in complete 
absorption in a z ~ 6 Ly-a forest spectrum (Gunn & 
Peterson 1965). In spite of this intrinsic complication, a 
little ingenuity has led to substantial progress. 

For example, one can measure absorption in the Ly-/3 
and Ly-7 troughs of a quasar spectrum. Owing to their 
weaker absorption cross sections, when these transitions 
are saturated they imply tighter constraints on the ion- 
ization state of the IGM than the optical depth in Ly-a. 
This approach has been used to suggest that the IGM is 
evolving rapidly near z ~ 6 (e.g. Fan et al. 2002, Cen & 
McDonald 2002, Lidz et al. 2002, Fan et al. 2006). The 
constraints are, however, consistent with a mostly ion- 
ized IGM. Furthermore, the transmission is influenced 
strongly by rare underdense regions at high redshift, and 
so the conclusions depend sensitively on the probability 
distribution of gas in the IGM (Oh k Furlanetto 2005, 
Becker et al. 2006a). Another interesting statistic is to 
consider how much the absorption, averaged over large 
stretches of spectra, varies from sightline-to-sightlinc. 
Close to and during reionization, the ultraviolet radi- 
ation background should fluctuate strongly (e.g. Zuo 
1992, Wyithe & Loeb 2005a) and potentially increase 
the sightline-to-sightline scatter in the mean absorption. 
However, current measurements are broadly consistent 
with density fluctuations alone (Lidz et al. 2006a, Liu et 
al. 2006). 

It is also possible to use a metal line tracer of the ioniza- 
tion state of the IGM, such as OI, which conveniently lies 
redward of Ly-a and has an ionization potential similar 
to that of hydrogen (Oh 2002). In fact, high-resolution 
Keck spectra of the z ~ 6 SDSS quasars do reveal some 
OI lines, with 4 out of 6 detected systems lying towards 
the highest redshift quasar known (Becker et al. 2006b) . 
Interestingly, some of the OI systems are nearby regions 
that show transmission in the Lya and Ly/3 forests of 
this quasar (Becker et al. 2006b). The interpretation of 
these observations is unclear: the OI systems might re- 
flect dense clumps of neutral gas in a highly ionized IGM, 
or instead could indicate inhomogeneous metal pollution 
in a more neutral IGM. 

Finally, the tightest constraints claimed on the ioniza- 
tion state of the z ~ 6 IGM come from measurements of 
the proximity regions around z ~ 6 quasars. Several au- 
thors, starting with Wyithe & Loeb (2004), have argued 
that these regions are small, indicating that quasar ion- 
ization fronts are expanding into a largely neutral IGM. 
Mesinger & Haiman (2004) claim to detect the edge of 
a quasar ionization front around one of the SDSS z ~ 6 
quasars, using additional leverage from the Ly/3 region, 
and suggest that the neutral fraction is Xhi > 0.25 at 
z ~ 6. These authors' constraint comes not so much 
from the apparent size of the proximity zone, but from 
the detailed radial dependence of the transmission and 
from a stretch of spectrum with Ly/3 transmission and 
no corresponding Lya transmission. They attribute this 
to damping wing absorption, a signature of a partly neu- 



tral IGM. Oh & Furlanetto (2005), however, argue that 
such stretches occur at high redshift even when the IGM 
is highly ionized and may not indicate damping wing 
absorption. Fan et al. (2006) emphasize caution in in- 
terpreting proximity region measurements, but observe 
rapid evolution in the sizes of quasar proximity regions 
from z — 5.7 — 6.3, arguing for a correspondingly rapid 
evolution in the neutral fraction. Their final constraint 
is based on only the evolution of the proximity region 
size, which they argue reflects the change in the neu- 
tral fraction. Consequently, they quote a significantly 
more conservative limit than previous authors, requiring 
a volume-weighted neutral fraction of only JThi > 10~ 3 . 

Recently, more detailed proximity zone calculations 
have been performed. Bolton & Haehnelt (2006) studied 
quasar transmission carefully using ID radiative transfer 
calculations, and determined that it is difficult in general 
to distinguish highly ionized and mostly neutral models 
with proximity zone measurements. Maselli et al. (2006) 
came to a similar conclusion. Mesinger & Haiman (2006), 
however, argue that detailed fitting of the transmission 
pdf in the quasar proximity zones favors a partly neutral 
IGM, with a lower limit of xm > 0.033, and a consider- 
ably larger preferred value. 

In each of these studies, the authors have assumed that 
quasar ionization fronts expand into a uniformly ionized 
surrounding IGM, with some low level, yet homogeneous 
background ionization. If the z > 6 quasar spectra truly 
probe the pre-reionization epoch, then this is likely a very 
poor approximation. The pre-reionization IGM should 
resemble swiss cheese (Loeb 2006), with large HII re- 
gions forming around clustered galaxies embedded in a 
surrounding neutral IGM (e.g. Sokasian et al. 2003, 
Ciardi et al. 2003, Furlanetto et al. 2004a,c, Iliev et 
al. 2006, Zahn et al. 2006, McQuinn et al. 2006a, Trac 
& Cen 2006). Naively, this complicates distinguishing 
partly neutral and highly ionized models on the basis of 
quasar proximity zones. Quasar ionization fronts may 
extend further along sightlines that traverse several ion- 
ized HII regions, and be more limited along other direc- 
tions that traverse several neutral patches. Moreover, 
transmission at the edge of the proximity zone may be 
related to background galaxies rather than the quasar 
itself, making it still harder to locate the 'edge' of the 
proximity zone using absorption spectra (see also Wyithe 
& Loeb 2006a). Our present paper extends these earlier 
works, and focuses on how 'patchy reionization' impacts 
quasar proximity zones. 

The outline of our paper and our basic line of argu- 
ment is as follows. Given that most of the volume of the 
IGM appears to be highly ionized by z < 6, we argue 
that reionization is unlikely rapid enough for the IGM 
to be mostly neutral when the highest redshift quasars 
observed turned on (fJ2]). From these considerations, we 
suggest that the IGM is at least 50% ionized at quasar 
turn-on. In Sj3j we describe 3D radiative transfer calcu- 
lations for plausible partly neutral models, detailing the 
initial ionization state of the IGM. Here we argue that the 
highest redshift quasars are born into large HII regions, 
even if as much as 50% of the volume of the IGM is neu- 
tral. Yu & Lu (2005) previously argued that overdense 
z ~ 6 quasar environments should reionize before typical 
regions, but here we examine the consequences of this 
in more detail, and arrive ultimately at somewhat differ- 
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ent conclusions. Even in this maximally (50%) neutral 
scenario, we find long skewers towards quasar progeni- 
tor halos which pass entirely, or predominantly, through 
ionized bubbles. 

Using the initial ionization field from our 3D calcula- 
tions as input, we perform (more detailed) fD radiative 
transfer calculations, describing the subsequent propaga- 
tion of quasar ionization fronts (@. Here we find that 
quasar front extents, in our patchy reionization mod- 
els, depend sensitively on the long ionized pathways cre- 
ated by surrounding galaxies before the quasar is born. 
Since quasars are typically born into large HII regions, 
the fronts tend to extend further in patchy reionization 
models than in models with a uniform IGM of the same 
neutral fraction, although with significant sightline-to- 
sightline variation. In $5] we construct mock absorption 
spectra and show, in agreement with previous authors 
(Bolton & Haehnelt 2006, Maselli et al. 2006), that it 
is difficult to accurately recover the position of a quasar 
front, and that estimates of the front position from ab- 
sorption spectra are generally underestimates. This is a 
consequence of the typically high Lya opacity at the edge 
of a z ~ 6 quasar front. We suggest, however, an alterna- 
tive algorithm for finding front positions from absorption 
spectra. We then estimate the importance of unresolved 
(in our numerical simulations) small scale structure on 
our mock absorption spectra. Our results are more sen- 
sitive to the highly uncertain level of small scale struc- 
ture in the IGM than to the volume-weighted ionization 
fraction. Finally, in Sj6]we conclude and discuss possible 
future research directions. 

Throughout, we assume a flat, ACDM cosmology pa- 
rameterized by: f2 m = 0.3, £1a = 0.7, £1/, = 0.04, 
H = lOO/i km/s/Mpc with h — 0.7, and a scale-invariant 
primordial power spectrum with n = 1, normalized to 
ag(z = 0) = 0.9. Our adopted value for as is a lit- 
tle larger than the central value favored by 3-year con- 
straints from WMAP (Spergel et al. 2006), but uncer- 
tainties in the cosmological model are sub-dominant to 
other aspects of our theoretical modeling, which we de- 
tail subsequently. 

2. INITIAL CONDITIONS 

Now, we consider the question: what should we expect 
for the ionization state of the gas around z > 6 quasars 
when they turn on? In this section, we address this issue 
using a rough analytic model; in the subsequent section 
we describe 3D radiative transfer calculations which ex- 
amine this in more detail. 

First, recall that constraints from z < 6 quasar spectra 
suggest that the IGM is highly ionized by z < 6 (e.g. 
Fan et al. 2002), although see SjBJfor a critical discussion. 
To date, the highest redshift quasar observed is at z q = 
6.42. If this source turns on over roughly a Salpeter 
time (t s ~ 4 x 10 7 yrs) before being observed, then its 
turn-on redshift is z on ~ 6.66. For the surrounding IGM 
to be mostly neutral at quasar birth, yet highly ionized 
by z < 6, reionization must proceed extremely rapidly, 
occurring over a short redshift span of Az < 0.7 or ~ 
10 8 yrs. Obviously, reionization must occur even more 
rapidly if measurements truly indicate that other quasars 
at slightly lower redshifts with z q > 6, (e.g. the z q = 
6.28 quasar SDSS J1030+0524), are also born when the 
surrounding IGM is highly neutral. 



Second, the z > 6 quasars are thought to reside in very 
rare and massive host halos (Fan et al. 2001, Li et al. 
2006), and hence the surrounding IGM is likely overdense 
out to large scales around the quasars (Loeb & Eisenstein 
1995, Barkana 2004, Faucher-Giguere et al. 2007). These 
regions should reionize before typical ones since halo col- 
lapse and galaxy formation are expected to occur earlier 
in large-scale overdensities (e.g. Barkana & Loeb 2004). 
The abundance of ionizing sources in an overdense re- 
gion at a given time should resemble the abundance of 
sources in a typical region at a later time. Even if typi- 
cal regions in the IGM are neutral when quasars turn on, 
the same may not be true of the overdense environments 
where quasars reside (see also Yu & Lu 2005). Recom- 
binations are also more efficient in overdense regions - 
this could offset the tendency for overdensities to reion- 
ize first, but this is unlikely to remove the trend. This 
is because galaxy formation is more sensitive to large 
scale overdensity than the recombination rate - indeed, 
the abundance of high mass halos is exponentially sensi- 
tive to the large scale overdensity (Barkana & Loeb 2004, 
Furlanetto & Oh 2005, Wyithe & Loeb 2006b). 

2.1. Globally-Averaged Ionization Fractions 

In this section we illustrate these effects using simple 
analytic estimates. We perform these calculations with 
the analytic model of Furlanetto et al. (2004). In its 
simplest incarnation, this method assumes that a galaxy 
of mass M ga ] can ionize a surrounding mass in the IGM 
of Mion = C-^gai- With this assumption, one can show 
that the average ionization fraction of gas in a region of 
radius R and linear overdensity 5 R is given by: 



(xi\6 R ,R) = ((f co ii\M mhu R,5 R ). 



(1) 



Here, (fcoii\M min , R, 5r) is the fraction of mass in halos 
with mass larger than M m i n in a region of over-density 5 R 
and radius R. In this equation, the impact of recombi- 
nations is absorbed into the parameter £. For our simple 
estimates below we hence ignore the recombination-rate 
enhancement in over-dense regions. 

The collapse fraction in an overdense region is given 
by: 



(/con | -Mmin, $c, M, 5m) = erfc 



mm * U M 



, (2) 



where S c is the critical overdensity for collapse, M indi- 
cates the mass scale corresponding to the spatial scale R, 
and c^jjj and a M are the variance of the linear density 
field smoothed on mass scales M m i n and M, respectively. 
The global collapse fraction follows from this expression 
in the limit 5m —> 0, o~m ~* 0. This equation implies that 
overdense regions will have larger collapse fractions than 
regions at the mean density and, under the assumptions 
of Equation (JTJ), will be ionized earlier. 

Consder, first, the redshift evolution of the globally- 
averaged ionization fraction, by taking 5 R — + and R — ► 
oo in Equation ([1]). The precise duration of reionization 
depends sensitively on the nature of the sources produc- 
ing ionizing photons at early times, which is highly uncer- 
tain. If rare, yet efficient sources reionize the IGM then 
the collapse and ionization fractions should grow rapidly 
with redshift (Equations [I] and [SJ. Furthermore, the du- 
ration of the reionization epoch depends on the efficiency 
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of thermal feedback, the time taken to photo-evaporate 
mini-halos in the IGM, as well as the detailed proper- 
ties and evolution of the cosmic star-formation rate and 
the fraction of ionizing photons escaping into the IGM. 
In our model, all of these details are subsumed into a 
single, redshift-independent parameter, £ (Equation [lj. 

Nevertheless, we can roughly gauge the range of possi- 
bilities by varying the parameter M mm in Equation ([2]), 
in each case normalizing £ to match (xj) = 1 at z = 6. 
Of course, reionization may finish at significantly higher 
redshift (i.e., (xi) — 1 is reached at z > 6); our goal here 
is to find the maximally neutral case at quasar turn-on 
provided that the entire volume is reionized by z < 6. 
We consider a wide range of values for M m i n . On the 
low end, we adopt M lmn = M coo \, and on the high end 
we take M min = 10 3 M cool . Here, M cool denotes the dark 
matter halo mass corresponding to a virial temperature 
of 10 4 K (M coo i = 1-3 x 10 8 M Q at z ~ 7, e.g. Barkana 
& Loeb 2001), above which atomic line cooling is effi- 
cient, allowing gas to cool, condense to form stars, and 
produce ionizing photons. The higher minimum source 
mass scenarios approximate models where photo-heating 
has limited the efficiency of star-formation in small mass 
halos (Thoul & Weinberg 1996, Navarro & Steinmetz 
1997, Dijkstra et al. 2004), and models in which su- 
pernova winds suppress star-formation in low mass halos 
(e.g. Springel & Hernquist 2003). We note that our high- 
est minimum source mass model (M min = 10 3 M coo i) is 
rather extreme, considerably larger than the suppression 
masses suggested by the above studies. We consider this 
case anyway to illustrate a plausible upper limit. 

We show the redshift evolution of the globally-averaged 
ionization fraction in the bottom panel of Figure [TJ The 
first qualitative feature apparent from the Figure is that 
reionization is generally quite extended: for example, in 
the cooling mass model the average ionization fraction 
is (x^ = 0.1,0.5,0.7,1. at z ~ 12,8,7, and z ~ 6 re- 
spectively. If the highest redshift quasar observed thus 
far, at z q — 6.42, turns on at z on ~ 6.66 as motivated 
above, the average ionization fraction in this model is 
~ 80% at turn-on. The process is less extended if the 
sources are rare, yet very efficient. For example, in the 
M min = 10 2 M coo i case, the IGM is only ~ 60% ionized 
at quasar turn on. However, even in the very extreme 
Mmin — 10 3 M coo i scenario, 50% of the IGM is ionized 
by quasar turn on at z ~ 6.66. The ionization fraction 
evolves more rapidly in the high minimum mass models, 
since here the host halos are still on the exponential tail 
of the mass-function near z ~ 6, and hence their abun- 
dance evolves quickly with redshift. Note that we likely 
over-estimate this effect, since we use Press-Schechter 
(1974) theory for simplicity: the halo abundance from 
recent high redshift numerical simulations more closely 
matches the Sheth-Tormen (1999) mass function, indi- 
cating that Press-Schechter theory underestimates the 
abundance of rare halos (Reed et al. 2003, Heitmann et 
al. 2006, Zahn et al. 2006, Lukic et al. 2007, although 
see Trac & Cen 2006). Note that although our constraint 
argues that the IGM is at least somewhat ionized when 
the highest redshift quasar observed turns on, it is not 
in conflict with the formal limits from Wyithe & Loeb 
(2004), Wyithe et al. (2005a), and Mesinger & Haiman 
(2004, 2006). 

Furthermore, we have taken £ as a free parameter, sim- 
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Fig. 1. — Redshift evolution of the volume-averaged ionization 
fraction. Top panel: The black solid line shows the redshift evo- 
lution of the ionization fraction in a typical region of the IGM, 
with the ionizing source efficiency calibrated so that {x^ = 1 at 
2 = 6, assuming all halos down to M coo \ contribute ionizing pho- 
tons with equal efficiency. The other curves indicate the average 
ionization fraction in spherical regions of different size, each with 
a massive quasar host (here M<im,liost = lO 13 A/0) at the center. 
The vertical blue dotted line indicates a plausible turn-on time for 
a 2 = 6.42 quasar. The curves illustrate that: i) the process of 
reionization takes a sufficiently long time that, even if it completes 
at z = 6, the typical location in the IGM is already ~ 80% ion- 
ized (in this model) when the z = 6.42 quasar turned on; and ii) 
provided that quasar host halos reside in highly overdense regions, 
their surroundings will reionize earlier than the typical region in 
the IGM. Bottom panel: The redshift evolution of the volume- 
averaged ionization fraction for a typical region as a function of 
the minimum host halo mass. If rare, very efficient, sources pro- 
duce most of the ionizing photons, reionization occurs more rapidly 
than if highly abundant yet less efficient sources produce most of 
the ionizing photons. In each model, more than 50% of the IGM 
volume is ionized at quasar turn-on. 

ply fixing it to match (x^ = 1 at z ~ 6. In fact, if 
all of the ionizing photons are indeed produced by the 
very rare halos with M > M ro ; n = 10 3 Af coo i, the sources 
need to be exceedingly efficient to reionize the IGM by 
z < 6. Specifically, we estimate the number of ioniz- 
ing photons per stellar baryon required in this model to 
achieve (x^ — 1 by z — 6, using Equations (82) and (83) 
of Furlanetto et al. (2006a). We adopt typical values 
of three recombinations per ionized hydrogen atom, an 
escape fraction of / osc = 0.1, and a star- formation effi- 
ciency of /* = 0.1, and find that N 7 ~ 70,000 ionizing 
photons per baryon are required in this scenario. This 
ionizing efficiency is substantially larger than the value 
expected for a Salpeter IMF, N 7 ~ 4, 000 (e.g. Cohn 
& Chang 2006), although it is achievable if the IMF is 
very top-heavy (e.g. Bromm et al. 2001) even at z ~ 7 
in spite of apparently wide-spread metal enrichment by 
z ~ 6 (Ryan- Weber et al. 2006). 

Finally, in we argue that it is conceivable that 
(x^ = 1 is achieved later than z = 6 - reionization 
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might end at as low a redshift as z ~ 5.5. Even if 
rcionization completes only by z = 5.5 we find that the 
neutral fraction at quasar turn-on is less than 50% in 
all of our models except our extreme M m ; n = 10 3 M coo i 
model. Moreover, reionization is likely more extended 
than in our model - for example, thermal feedback and 
mini-halos can extend the duration of reionization com- 
pared to our simple predictions (e.g. Haiman & Holder 
2003, McQuinn et al. 2006a). Therefore, our simple es- 
timates indicate that the IGM is unlikely as much as 
50% neutral at quasar turn-on and is more likely at least 
70% ionized by this time. These rough estimates are 
inevitably model dependent, but are in accord with ob- 
servational constraints from Lya emitters (Malhotra & 
Rhoads 2005, Haiman & Cen 2005, Dijkstra et al. 2006), 
and GRB optical afterglow spectra (Totani et al. 2006) 
which coincidentally give similar upper limits for the neu- 
tral fraction near the plausible turn-on redshifts of the 
z > 6 quasars. 

2.2. The Ionization Field Around Quasar Progenitor 

Halos 

The above calculations apply to typical regions in the 
IGM. We now extend our analysis to consider the ion- 
ization of the overdense environments expected around 
high redshift quasars. We will compute the probability 
distribution of the ionized fraction spherically-averaged 
around plausible quasar host halos, prior to quasar turn- 
on. Here, our calculation is similar to previous work by 
Yu & Lu (2005), but it differs in detail. We first consider 
the linear overdensity profiles around the z ~ 6 quasars 
following Loeb & Eisenstein (1995) and Barkana (2004). 
We will subsequently insert the linear density profile into 
Equation |T]). The first quantity of interest is the cross- 
correlation coefficient, p r ,R) between density fluctuations 
at a single point when smoothed on two different scales, 
r and R. The correlation coefficient is related to the 
co- variance, offo anc ^ individual variances, of, a R , by 

the relation p T ^ = of R / ' y/ 1 o 2 a R . The covariance can be 
expressed as an integral over the linear density power 
spectrum, P(k), through the relation 



'r,R 



d 3 k 



W(kr)W{kR)P(k). 



(3) 



If the window functions in this equation are (spheri- 
cally symmetric) top-hat filters in fc-space then the co- 
variance is precisely equivalent to the variance on scale R 
- i.e., of R = a R , and the correlation coefficient is given 
by p r ,R = o~n/cr r . The co- variance will be a little differ- 
ent if the window functions are top-hats in real space. In 
spite of this, for simplicity we adopt the sharp /c-space 
correlation coefficient even though we calculate aR and 
ay using a real-space top hat, as is frequently done in 
Press-Schechter type calculations. 

In what follows, we ignore the 'cloud-in-cloud' problem 
of Press-Schechter theory, and do not include an 'ab- 
sorbing barrier' in our calculation (Bond et al. 1991). 
Barkana (2004) constructs a more elaborate model for 
the initial overdensity profile around massive halos that 
is consistent with extended Press-Schechter theory. How- 
ever, in the limit of the very rare halos we consider here, 
Barkana (2004) shows that this more elaborate model 
reduces to our present one, which we hence adopt for 



simplicity. Assuming the density field is a bi-variate 
Gaussian with the correlation coefficient given above, we 
can immediately write down an expression for the de- 
sired conditional probability distribution: we would like 
to know the differential probability that a point is at a 
linear overdensity Sr when smoothed on a scale R at 
some redshift z\, given that it is at linear overdensity 5 C 
when smoothed on a smaller scale, r at redshift z 2 . This 
expression is: 



P{6 R ,o-R,zi\6 c ,o- r ,z 2 ) = ■ 
(S c - <5 R ) 2 
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In this equation, all quantities are linearly- 
extrapolated to the present day. Hence, if 5r is 
the linear overdensity at smoothing scale R and redshift 
Zi, and D(z{) is the linear growth factor normalized to 
unity today, then Sr = 5r/D(zi) is the linear overden- 
sity on scale R today. If we take S c to be the critical 
overdensity for collapse scaled to the present day, then 
this equation tells us the conditional probability that a 
region will have a large scale overdensity, Sr on scale R, 
given that the region contains a massive halo - i.e., the 
region reaches the collapse threshold, S c , at a smaller 
smoothing scale, r. Combining Equations (JTJ) , ©, and 
([4]), and computing the Jacobian \dxifddR\~ 1 , we can 
determine the desired ionization probability distribution. 
The ionization fraction, averaged over an ensemble of 
quasar host halos, just follows from determining the 
mean of the resulting ionization probability distribution. 
Note that in this calculation we neglect sources outside 
the region of interest, but since we are interested in 
rather large smoothing scales, this is probably not too 
poor an approximation. 

Our results for the average ionization in regions that 



will host a massive quasar host halo at z 2 



6.42 



are shown in the top panel of Figure [TJ Here we as- 
sume that z > 6 quasars reside in 1O 13 M halos as in 
Haiman & Loeb (2001) and Li et al. (2006), although 
this choice is uncertain. Our simulation results in sub- 
sequent sections assume quasars reside in slightly less 
massive and more common halos. For the present calcu- 
lation we adopt our model in which all host halos down 
to Mcooi contain ionizing sources and contribute to reion- 
ization. Figure [T] demonstrates that a significant volume 
of gas around quasar progenitors is generally reionized 
well before reionization completes in a typical region of 
the IGM. Indeed, ~ 100% of the volume in a sphere of ra- 
dius 15 co-moving Mpc centered on the z = 6.42 quasar 
progenitor is ionized in our model, on average, by z > 7, 
~ 50 Myr before the quasar turns on. As one averages 
over progressively larger volumes, the mean interior over- 
density approaches the cosmic average, and the ioniza- 
tion approaches its cosmic mean value. Indeed, the figure 
indicates that spheres of radius 35 — 45 Mpc/h appear 
to be essentially representative samples. Given that the 
purported proximity zone size is ~ 40 co- moving Mpc//i 
(e.g. Wyithe et al. 2005a), one might conclude that this 
bias is hence unimportant for interpreting quasar prox- 
imity zone measurements. In fact, we will show in the 
next section that ID skewers through the IGM towards 
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Fig. 2. — Probability distribution for the spherically-averaged 
ionization fraction when the z = 6.42 quasar turns on in our model 
for several different smoothing scales. The curves show pdfs for 
the fraction of the volume that is ionized in regions of different 
size centered on the quasar host halo. A fraction of Xj > 1 simply 
indicates that there are enough photons to ionize the entire volume 
of the region in question. 



quasar host halos can pass entirely through HII bubbles 
out to much larger distances. Therefore the tendency 
for quasars to be born into large HII bubbles is in fact 
important for interpreting the proximity zones in z ~ 6 
quasar absorption spectra. 

We can also examine the full probability distribution 
of the spherically-averaged ionization fraction around 
quasar host halos, rather than just the mean of this dis- 
tribution. The results of this calculation are shown in 
Figure EH for M halo = 10 13 M Q and M min = Af cool at 
z on = 6.66. Note that the curves extend in some cases 
beyond Xi = 1, which just implies that there are more 
than enough photons to ionize the entire region in ques- 
tion. On small smoothing scales, there is significant scat- 
ter in the ionization from host halo to host halo. Indeed, 
there is sonic probability that a region will be less ionized 
than the cosmic mean, although essentially all host halo 
regions are more than 60 — 70% ionized in this model. 
This scatter is potentially important for quasar proxim- 
ity zone observations. 

In models where rarer sources dominate the ionizing 
photon budget, we expect the tendency for overdense re- 
gions to reionize first to be enhanced. Indeed, since reion- 
ization progresses most rapidly in such models, they rep- 
resent the most plausible scenarios in which the IGM is 
considerably neutral at quasar turn-on yet highly ionized 
by z = 6 (Figure [1]). On the other hand, we ignored re- 
combinations in our analysis, and these should lessen the 
tendency somewhat for overdense regions to ionize first 
(e.g. Wyithe & Loeb 2006b), although the precise impact 
of this process will depend on the number of recombina- 
tions per hydrogen atom during reionization, which is 



uncertain. 

2.3. Quasar Host Galaxy: Ionizing Photon Budget from 
Stellar and Quasar Activity 

A separate but related question regards the relative 
contribution of starburst and quasar activity to the cu- 
mulative budget of ionizing photons released by the 
quasar host galaxy alone (see also Yu & Lu 2005, Wyithe 
& Loeb 2006a). Indeed, the z ~ 6 quasar hosts contain 
~ solar metallicity gas suggesting substantial levels of 
prior star formation (e.g. Barth et al. 2003). Our previ- 
ous arguments imply that neighboring galaxies reionize 
much of the surrounding IGM before the highest redshift 
quasar turns on. Here, we focus solely on the budget of 
ionizing photons produced within the quasar host galaxy 
itself. 

We can estimate the cumulative output of ionizing pho- 
tons from quasar and stellar sources on the basis of ener- 
getic arguments, combined with assumptions regarding 
the radiative efficiency, the escape fraction of ionizing 
photons, and the spectral energy distribution of emit- 
ted radiation from quasars and stars respectively, along 
with an assumed relation between black hole and stellar 
mass. A galaxy with a stellar-mass, M+, produces a cu- 
mulative number of ionizing photons (escaping into the 
IGM) given by -/V io „,* ~ feac,*N*rM*/mp. Here, A 7 is the 
number of ionizing photons produced per stellar baryon, 
/esc,* is the escape fraction of stellar ionizing photons, 
and m p is the proton mass. The energy radiated while 
forming a black hole of mass msH is -Erad ~ e ra d"iBHC 2 , 
for a radiative efficiency of e ra d- Assuming that a frac- 
tion /i on of this energy comes out in ionizing photons, 
with a fraction / CS c,bh of such photons escaping into the 
IGM at a mean energy of (hv)- lon , the cumulative num- 
ber of ionizing photons produced by quasar activity is 
Aion.BH /ion /esc, BHeradC 2 /(/ii / )ion- We are interested 
in the ratio of photons produced by quasar and stellar 
activity. We estimate this ratio assuming that the lo- 
cal Magorrian relation connecting stellar and black hole 
mass is upheld at high redshift, as supported by the z ~ 6 
black hole growth simulations of Li ct al. (2006). We as- 
sume typical values for the other parameters (e.g. Cohn 
& Chang 2006) arriving at: 



A i( , 



ion.BH 



.* N M* {hv) 



/esc,BH 
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For our fiducial parameters, the cumulative numbers 
of ionizing photons from quasar and stellar activity as- 
sociated with the quasar host galaxy alone are already 
comparable, although uncertainties in these parameters 
may allow this conclusion to change by ~ an order of 
magnitude. For example, if the quasar turns on before 
most of the stars and lies off of the local Magorrian rela- 
tion, the ratio of stellar to quasar photons will be smaller 
than in the above ratio. This late star formation scenario 
may be in tension with the presence of ~ solar metallic- 
ity gas in the quasar host as mentioned above. We view 
this simple estimate as a further argument that one needs 
to consider the ionization from surrounding galaxies, in 
conjunction with that from the quasar itself, in order 
to model high redshift quasar proximity zones. On the 
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other hand, note that although quasar and stellar activ- 
ity associated with the quasar host galaxy cumulatively 
produce comparable numbers of ionizing photons, the in- 
stantaneous photoionization rate from the quasar greatly 
exceeds that from stars during the lifetime of the quasar 
(see gj. 

3. 3D RADIATIVE TRANSFER CALCULATIONS 

In this section, we use 3D radiative transfer calcula- 
tions to visualize the effects discussed above, and to de- 
tail the plausible ionization state of the IGM when the 
highest redshift quasar turns on. Here we aim to char- 
acterize the ionization field produced by high redshift 
galaxies prior to quasar turn-on, and to quantify its in- 
homogeneities. 

3.1. Simulations 

We use the 3D radiative transfer calculations from 
McQuinn et al. (2006a). These simulations follow the 
growth of HII regions in a cubic box of co-moving side- 
length Lbox = 65.6 Mpc/h. The radiative transfer cal- 
culations start from an N-body simulation run with an 
enhanced version of Gadget-2 (Springel 2005), tracking 
1024 3 dark matter particles, and resolving dark matter 
halos with mass M > 2 x 10 9 M Q . Ionizing sources are 
placed in simulated dark matter halos, using a simple 
prescription to connect ionizing luminosity and halo mass 
(Zahn et al. 2006). Our fiducial ionizing source prescrip- 
tion is identical to that described in Lidz et al. (2006b): 
halos with mass larger than M m i„ = 2 x 10 9 M Q host ion- 
izing sources, with an ionizing luminosity proportional 
to halo mass. The simulated dark matter density field is 
then interpolated onto a 512 3 Cartesian grid, and we sub- 
sequently assume that the gas density closely tracks the 
simulated dark matter density field (see Zahn et al. 2006 
for a discussion). Finally, radiative transfer is treated in 
a post-processing stage using the code of McQuinn et al. 
(2006a), a refinement of the Sokasian et al. (2001, 2003, 
2004) code, which in turn uses the adaptive ray-tracing 
scheme of Abel & Wandelt (2002). 

How sensitive are the HII bubble sizes at different 
stages of reionization to the details of our modeling? 
McQuinn et al. (2006a) demonstrate that the size distri- 
bution of HII regions during reionization depends most 
strongly on the ionized fraction, and is less sensitive to 
the precise redshift at which a given ionization fraction 
is reached, and other details of the reionization process. 
These authors did find, however, larger HII regions at 
a given ionized fraction if rare, highly biased sources 
produce most of the ionizing photons. Furthermore, if 
mini-halos survive pre-heating and are sufficiently abun- 
dant during reionization, this reduces the characteristic 
HII region size (Furlanetto & Oh 2005, McQuinn et al. 
2006a) . Note that the minimum source mass in our fidu- 
cial simulation is ~ 10A/ coo i, but even if this is a slight 
overestimate it should have little impact on the bubble 
size distribution at fized ionization fraction (McQuinn 
et al. 2006a). In this paper we adopt the fiducial source 
prescription mentioned above - we refer the reader to the 
earlier papers for details regarding the model dependence 
of our bubble size distributions. 

We now characterize the ionization field around plausi- 
ble quasar host halos. In practice, we examine outputs at 
several different redshifts and ionization fractions. How- 



ever, based on the McQuinn et al. (2006a) findings, we 
expect this to be roughly equivalent to examining the 
ionization field at fixed redshift, yet varying ionization 
fraction. Hence, we generally parameterize our partly 
neutral models by the volume-weighted ionization frac- 
tion. 

3.2. Ionization Maps 

First, we use our radiative transfer simulation simply 
to visualize the trends suggested by the analytic argu- 
ments of the previous section. In Figure [3] we show 
thin slices (1 cell thick = 0.13 Mpc/h) through the 
simulated ionization field at three different stages dur- 
ing reionization (volume- weighted ionization fractions of 
(xi) = 0.31,0.48, and 0.70). In the left-hand side panels 
we show the ionization field close to a plausible quasar 
host halo (with Mh a i = 3 x 1O 12 M0). For contrast, in 
the right-hand side panels we show the ionization field 
through random slices of the same simulation outputs. 

The figure provides a clear visualization of several 
points alluded to in the previous section. First, overdense 
environments are more ionized than typical regions: the 
eventual quasar host halo is surrounded by a large HII 
region blown out by galaxies born before the quasar it- 
self turns on. Second, the HII regions are themselves 
correlated over large scales and so the host halo bubble 
tends to be surrounded by large neighboring HII regions. 
These features are seen clearly at each of a few different 
stages during reionization by contrasting the region con- 
taining the quasar host halo in the left panel with the 
random regions shown in the right panel. The bottom 
two panels are likely more relevant than the top panels 
given the arguments of the previous section which sug- 
gest (xi) > 0.5 at quasar turn-on. Note also that our 
limited simulation volume likely impacts our ability to 
capture large HII bubbles at the end of reionization. In 
particular, while our general point should be robust to 
our limited simulation volume, the detailed results may 
be affected - particularly at (xi) = 0.7 - as one might 
expect from the large HII regions shown in Figure [3J 

3.3. Statistical Description of Ionization Fields around 
Quasar Progenitors 

We can quantify the qualitative features of Figure [3] in 
two useful ways. The first such statistical measure is to 
compare the average ionization fraction in spheres of dif- 
ferent size, with each sphere centered on a massive halo. 
This calculation is essentially a simulation version of the 
analytic calculation shown in Figure [2] For comparison, 
we show the results at each of (x.^) — 0.48 (roughly the 
most neutral case plausible when the z > 6 quasars turn 
on, as discussed in the previous section), and (X4) = 0.70 
for three different mass bins. For each mass bin, regions 
surrounding massive halos are on average more ionized 
than the cosmic mean ionization level. As we discussed 
previously, this 'bias' results because overdense regions 
contain more halos and hence ionizing sources and reion- 
ize earlier than typical regions. This bias is naturally 
most significant for the largest simulated halos, which are 
likely hosts for the z ~ 6 SDSS quasars. Indeed, Figure 
[31 demonstrates that the spherically averaged ionization 
around simulated halos with M > 10 12 Af Q exceeds the 
cosmic mean ionization out to scales of R ~ 40 Mpc/h, 
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Fig. 3. — Ionization fields around a plausible z > 6 quasar host halo, compared to the ionization in a typical region of the IGM. Top panel: 
The ionization field in a thin slice (0.13 Mpc//i thick, with a side-length of Lbox = 65.6 Mpc/?t) around a halo which may subsequently 
house a z > 6 quasar. The slices show different stages of reionization in this overdense environment, when the globally-averaged, volume- 
weighted ionization fraction is (x;) = 0.31,0.48, and 0.70 (from left to right). The white regions denote ionized gas, while black regions 
show neutral gas. The black square indicates the location of the quasar host halo. Bottom panel: For contrast, the ionization field in thin 
slices through random regions of the IGM at the same global ionization fractions as in the top panel. Comparing the top and bottom 
panels, it is clear that reionization is accelerated in the overdense regions that will subsequently host the z ~ 6 quasars. This results because 
galaxy formation occurs earlier in overdense regions. 



and 75% of the volume is typically ionized within spher- 
ical shells of radii R < 10 Mpc//i, and R < 25 Mpc/h 
at (xi) = 0.48 and (x/) — 0.70 respectively. The true 
bias is potentially even larger, since the z ~ 6 quasars 
may reside in even rarer, more massive halos than con- 
tained in our simulation volume (Haiman & Loeb 2001, 
Fan et al. 2001, Li et al. 2006). Our most massive sim- 
ulated halo at this redshift has a dark matter mass of 
M h = 3 x 1O 12 M . 

Also relevant is the level of halo-to-halo scatter in the 
spherically averaged ionization fraction. In Figure 2] we 
show the 1 — a scatter across halos in the mass bin with 
lO n M < M < 10 12 M Q . The scatter is quite signifi- 
cant: for example, there are regions of R > 10 Mpc/h 
centered on massive halos that are less ionized than the 
cosmic mean, even though such regions are on average 
more ionized than the cosmic mean level. This scat- 
ter indicates that variations in the initial ionization field 
around quasar host halos are substantial, and can poten- 
tially lead to significant sightlinc-to-sightline scatter in 
the proximity zone size - provided that these observa- 
tions indeed probe the pre-reionization IGM. The simu- 
lation results shown here are also roughly consistent with 
the analytic calculations of Figures Q] and [21 

While the spherically averaged ionization fractions 
shown above are illustrative, a ID statistic is more rel- 



evant for interpreting quasar absorption spectra. As an 
example, we extend lines of sight from the massive halos 
in our simulation box and calculate the distance traveled 
along each line of sight before crossing neutral material. 
In practice we define a cell to be neutral when the HI pho- 
toionization rate averaged over the cell is less than 10 -5 
times the cosmic mean photoionization rate. Our general 
point is, however, insensitive to this somewhat arbitrary 
definition. The results of this calculation are shown in 
Figure El where we construct the cumulative probability 
distribution of 'first neutral crossings' calculated from 
100 lines of sight, cast at random directions from each 
simulated halo with Mhaio > 10 12 Af©. We consider the 
first crossing distribution when the volume- weighted ion- 
ization fraction of the IGM is each of (xi) = 0.48 and 
(x^ = 0.70. The probability distributions shown in the 
figure are quite broad, and the mean first crossing dis- 
tance from lines of sight towards quasar host halos is 
large. Specifically, at (x,) = 0.48, the mean and 95% 
upper limit for the first crossing distance are 13 and 
28 Mpc/h respectively, while the corresponding num- 
bers are 24 and 61 Mpc/h at (x^ = 0.70. Hence, even 
in partly neutral scenarios, long skewers towards quasar 
hosts will traverse entirely through ionized bubbles be- 
fore the quasars themselves turn on. Indeed if the z > 6 
quasars turn on during the later stages of reionization, a 
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Fig. 4. — Spherically-averaged ionization fraction centered on 
halos of different mass, as a function of radius. The solid black, 
red dotted, and green dashed lines indicate the average ionization 
fraction surrounding massive halos from the simulation, while the 
red dashed band gives the 1 — a spread across different halos in 
the middle mass bin (1O 11 M < M < 1O 12 M ). The horizontal 
blue dotted line indicates the ionization fraction averaged over the 
entire simulation volume. The top panel shows the average ion- 
ization around massive halos when (xi) = 0.48, while the bottom 
panel gives the same at (xj) = 0.70. Reionization happens ear- 
lier around the overdense regions that house massive halos in our 
simulation, and hence the surrounding IGM is more ionized than 
typical regions. 



sizable fraction of skewers should pass entirely through 
ionized bubbles out to length scales even larger than ~ 40 
Mpc/h - comparable to the purported size of proximity 
zones inferred from the highest redshift quasar spectra 
(e.g. Wyithe et al. 2005a). This indicates that the 
proximity zone sizes deduced from quasar spectra are 
not primarily determined by the volume-weighted neu- 
tral fraction of the IGM, as we discuss subsequently. 

In summary, if the z > 6 quasars truly probe the 
pre-reionization IGM, the initial ionization field prior 
to quasar turn on will be inhomogeneous even on quite 
large scales. Moreover, quasars are biased tracers, re- 
siding preferentially in overdense regions, likely ionized 
before typical regions in the IGM. On spherical average, 
this bias will be relatively small once one averages over 
scales of > 30 Mpc/h. Nonetheless, ID skewers through 
the IGM may pass entirely through ionized bubbles out 
to considerably larger scales. This is another example 
of aliasing, where fluctuations in a random field along 
skewers of a given length are much larger than those 
from averaging the same field over spheres of compara- 
ble diameter (Kaiser & Peacock 1991). These results 
contradict the previous wisdom that z ~ 6 quasar prox- 
imity zones probe sufficiently large scales that ionization 
inhomogeneities average out, allowing one to model the 
surrounding IGM with some uniform, yet low level ion- 
ization field. In fact, our results suggest that these large 
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Fig. 5. — Cumulative probability distribution for first crossing 
neutral cells. The top panel shows results for an ionization frac- 
tion of (xi) = 0.48, while the bottom panel shows the probability 
distribution at (xj) = 0.70. In each panel, the red solid histogram 
gives the first-crossing distance for random sightlines emanating 
from probable quasar host halos, before the quasar turns on. The 
blue dashed histogram shows, for contrast, the first crossing dis- 
tance for sightlines emanating from a random point in the IGM. 
The first crossing distributions are quite broad, and lines of sight 
from probable quasar host halos may extend for a length compa- 
rable to the purported proximity zone size (L ~ 40Mpc/h) before 
touching any neutral cells, even prior to quasar turn-on. 

scale inhomogeneities significantly complicate the inter- 
pretation of z r-j 6 quasar proximity zones. In order to 
understand the precise implications of these initial inho- 
mogeneities, we now consider the subsequent propaga- 
tion of quasar ionization fronts. 

4. QUASAR IONIZATION FRONTS AND ID RADIATIVE 
TRANSFER 

In this section we follow the propagation of quasar ion- 
ization fronts using the calculations of the previous sec- 
tion as 'initial conditions' and considering, in particu- 
lar, the difference between the patchy-reionization mod- 
els discussed above and partly neutral models with some 
low level uniform ionization, as assumed by most previ- 
ous modelers. 

It is important to keep in mind that our ultimate goal 
is to construct mock Lya forest absorption spectra for 
each scenario and compare with observations. In this 
sense, our results will be sensitive to the precise values 
of the residual neutral fraction within the ionized bubbles 
shown in Figure [3] For example, if residual neutral frac- 
tions are at the ~ 10 -3 level, the Lya absorption spectra 
will be very different than if residual neutral fractions are 
instead only ~ 10~ 4 . This is in contrast to simulations 
of the 21 cm signal during reionization, for which the 
bubbles of Figure [3] are simply 'holes' of effectively zero 
signal, and one is hence insensitive to the small resid- 
ual neutral fraction in the interior of the bubbles (apart 
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from occasional dense clumps of more neutral gas which 
fill only a small fraction of the interior volume). Sim- 
ulating Lya absorption spectra in the pre-reionization 
epoch is therefore more challenging than characterizing 
the large-scale 21 cm signal. Moreover, our 3D radiative 
transfer code assumes a sharp ionization front, simply 
tracking the position of the edge of the ionization front, 
and assuming ionization equilibrium within the front. 
Our 3D calculations also ignore helium and assume a 
uniform temperature within the front. These approxi- 
mations speed up our 3D calculations substantially, and 
they are likely very accurate for capturing the size distri- 
bution of HII regions during reionization, and the 21 cm 
signal. However, they may be inadequate for our present 
purposes: here we want the precise residual neutral frac- 
tion and thermal structure within bubble interiors after 
ionization from quasars. Our usual approximations are 
especially dubious given that quasars have a hard spec- 
trum of ionizing photons. 

In order to do optimize our method, we supplement our 
3D analysis from the previous section with more detailed 
ID radiative transfer calculations, tracking the subse- 
quent ionization from the quasar. Here our technique is 
similar to that of Bolton & Haehnelt (2006). This ap- 
proach is also natural for this application since quasar 
absorption spectra probe only ID skewers through the 
IGM. Our ID calculations self-consistently solve for the 
optical depth, the ionization balance, and the temper- 
ature of the gas after the quasar turns on. We do not 
track the hydrodynamic response of gas to the ionization 
fronts. Of course our ID approach is also an approxima- 
tion, but we expect it to be a good one. 

4.1. Model Parameters 

Before proceeding with our ID calculations, we briefly 
discuss several uncertain parameters that go into our sub- 
sequent modeling. The first relevant quantity is the size 
distribution of HII regions at different stages of reion- 
ization, as mentioned in the previous section. This de- 
pends primarily on the nature of the ionizing sources and 
the abundance of mini-halos and Lyman-limit systems 
(Furlanetto & Oh 2005, Furlanetto et al. 2006c, Mc- 
Quinn ct al. 2006a). In the present paper, we consider 
only the fiducial ionizing source prescription described 
in the previous section. If mini-halos/Lyman- limit sys- 
tems are abundant, then the pre-quasar HII regions will 
be smaller than assumed here, while if rarer sources pro- 
duce most of the ionizing photons, the HII regions will 
be larger (see McQuinn et al. 2006a for quantitative de- 
tails) . 

4.1.1. Hydrogen Photoionization Rates 

Even given the size distribution of HII regions, the pre- 
cise hydrogen photoionization rate within bubble interi- 
ors is somewhat uncertain. It depends on the precise 
nature of the ionizing sources, the abundance of Lyman- 
limit systems and the mean free path for ionizing pho- 
tons in the bubble interiors, and other factors. In prin- 
ciple, one would hope to model all of these quantities 
self-consistently. In practice, we believe that the precise 
photoionization rates calculated from our 3D simulations 
for a given model are less robust than our predictions of 
the bubble size distribution and 21 cm signal. Owing 
to this, we will extract only the spatial fluctuations in 



Thi (Ar(a f ) = rHi(x)/(rHi)) directly from our 3D sim- 
ulation, and treat the spatial average as a free parame- 
ter. In our fiducial calculations, we somewhat arbitrarily 
normalize the volume-averaged HI photoionization rate 
to (rm) = 10~ 13 s -1 (we adopt this value for both our 
(xi) — 0.48 and our (x,) = 0.7 model). This is, how- 
ever, similar to the values measured directly from our 
simulation: (r H i) = 9.6 x 10~ 14 s~ 1 at (a*) = 0.48, and 
(r ffl ) = 2.2 x lO-^s" 1 at (x^ = 0.7. In we ex- 
amine the sensitivity of mock Lya absorption spectra to 
this choice. 

For clarity, we pause here to point out the significant 
distinction between the hydrogen photoionization rate 
in a uniformly ionized medium, and that in a 'patchy 
reionization' model. In a uniformly-ionized IGM, pho- 
toionization equilibrium implies a one-to-one relation be- 
tween the neutral fraction at a given density and the hy- 
drogen photoionization rate. It is easy to see that the 
volume-averaged hydrogen photoionization rate in our 
patchy reionization models will be vastly different than 
that in a uniformly-ionized IGM of the same (low-level) 
mean ionization fraction. 

Let us illustrate this point explicitly by considering 
two toy models. In the first case, representing patchy 
reionization, we imagine equal-sized ionized bubbles each 
with an interior neutral fraction Xm << 1, filling a 
fraction / (with < / < 1) of the volume of the 
IGM, which is otherwise completely neutral. For sim- 
plicity, we neglect helium and consider an IGM with 
a uniform density and thermal state. In the second 
model, representing 'uniform ionization', the neutral 
fraction is identical at each location within the IGM 
with X\ii = (Xhi) = 1 — /. In each case, photoion- 
ization equilibrium tells us that (Thi) = (coth)((1 — 
Xffl) 2 /Xffl). Now, in a uniformly-ionized IGM we get 
(r H i)uniform = (an H )f 2 /(l - /). In the toy patchy 
model, we have (1 — Xhi) 2 /^hi ~ 1/^in inside each 
ionized bubble, and (1 — Xm) 2 / Xm ~ outside of ion- 
ized regions. Hence, (r H i) P atch y ~ (an H )f/X m . The 
ratio of the volume-averaged photoionization rates is just 
(rHi)patchy/ (rm) uniform ^ 

(l-/)/(/X IN ). This will typ- 
ically be a very large number: for example, if 50% of 
the volume is filled by ionized bubbles (/ ~ 1/2) each 
with an interior neutral fraction of Xw ~ 10~ 4 , the vol- 
ume averaged photoionization rate is a factor of ~ 10 4 
times larger in the patchy-reionization model than in the 
uniform model. Hence the volume-averaged photoion- 
ization rates in our models are substantially larger than 
a reader accustomed to a uniformly-ionized IGM might 
anticipate. There is also no simple one-to-one relation 
between volume-averaged photoionization rate and neu- 
tral fraction, although the photoionization rate should 
generally increase as a larger fraction of the IGM is ion- 
ized, and each point in the IGM is influenced by more 
ionizing sources. 

In order to give some sense for the model dependence of 
our adopted photoionization rates, we follow Furlanetto 
& Oh (2005) and relate the hydrogen photoionization 
rate to the halo collapse fraction and mean free path to 
ionizing photons. In this model, the (total) proper ioniz- 
ing emissivity is given by e ~ C,(nH)df co \\/dt, where £ is 
the ionizing efficiency parameter of Equation {]]), (tih) is 
the (average) proper abundance of hydrogen atoms, and 
df 'coil/ dt is the time derivative of the collapse fraction. 
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Assuming that the mean free path to ionizing photons 
scales as f 3 / 2 (Zuo & Phinney 1993, Q5A\i , we then have: 
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In this equation, chi is the hydrogen photoionization 
absorption cross section at the hydrogen ionization edge, 
which is numerically cthi = 6.3 x 10~ 18 cm 2 . The quan- 
tity All represents the proper mean free path to ion- 
izing photons at the Lyman limit frequency (see £ 15. 41 
for comments on plausible values). The values of C, and 
df coil/ dz above are chosen for M m - ln = 10Af coo i and so 
that (xi) ~ 1 at z ~ 6, assuming the mass function fol- 
lows the Press-Schechter (1974) form. It is interesting 
to note that our fiducial hydrogen photoionization rates 
are a bit higher than the values Fan et al. (2006) infer 
from their mean absorption measurements on the basis 
of the Miralda-Escude et al. (2000) density pdf. Our val- 
ues are, however, more compatible with constraints from 
Becker et al. (2006a) derived using a lognormal model for 
the Lya opacity. We do caution against over-interpreting 
this, given the large uncertainties in our modeling. The 
above expression is only meant to loosely illustrate the 
model dependence of our adopted photoionization rates. 

4.1.2. Quasar Lifetimes and Lightcurves 

Next, let us consider the quasar lifetime and ionizing 
luminosity. In our fiducial model, we adopt t q — 3.6 x 10 7 
years for the quasar lifetime. This is our closest time 
output to a Salpeter time (Salpeter 1964), t s ~ 4.5 x 10 7 
years, assuming a radiative efficiency of e ra d = 0.1 and 
Eddington-limited accretion. This timescale is com- 
parable to observational estimates of quasar lifetimes 
(e.g. Martini 2004) and also roughly to the lifetime de- 
rived from numerical simulations of black hole growth 
(Springel et al. 2005, Hopkins et al. 2005). (In fact, the 
simulations of black hole growth indicate that the quasar 
lifetime depends on luminosity, but the Salpeter time is 
a reasonable measure of the timescale during which most 
of the black hole mass is accumulated; see e.g., Hopkins 
et al. 2006; Hopkins, Richards & Hernquist 2007.) How- 
ever, the quasar lifetime is still uncertain observation- 
ally by ~ 2 orders of magnitude (Martini 2004). Conse- 
quently, we also consider t q ~ i s /10 in which case it is 
generally easier to locate the edge of a quasar proximity 
zone observationally (Sj5]). Note, however, that decreas- 
ing the quasar lifetime increases the challenge of growing 
supermassive black holes by z ~ 6 (Haiman & Loeb 2001, 
Li et al. 2006). In our fiducial model, we adopt an ion- 
izing luminosity of N = 1.8 x 10 57 photons per second, 
corresponding roughly to the mean ionizing luminosity of 
the Fan et al. (2006) z ~ 6 quasar sample, provided their 
spectra match the Telfer et al. (2002) quasar composite 
spectrum. For reference, the brightest quasar in the Fan 
et al. (2006) sample is roughly ~ 2 times brighter than 
this, while the faintest quasar is ~ 3 times fainter. For 
simplicity we assume here that quasars radiate steadily 
at this luminosity for their entire lifetime (but see § |6|). 

Finally, the level of small scale structure in the IGM is 
highly uncertain and can affect our results. We examine 



the impact of small scale structure on our mock absorp- 
tion spectra in the next section: here we present results 
without including any 'subgrid-clumping'. 

4.2. Results from Our ID Calculations 

As input for our ID radiative transfer calculations we 
extract 20 random lines of sight towards each of the five 
most massive halos in our simulation box at z ~ 6.6. 
These halos have masses in the range 1.4 x 10 12 M Q < 
Mh < 3.1 x 10 12 A/ Q . We simulate quasars at z q = 6.42, 
corresponding to the highest redshift source in the Fan 
et al. (2006) sample. We project each sightline at a 
random angle with respect to our simulation box, inter- 
polating the density and peculiar velocity fields (needed 
to construct mock absorption spectra) from our N-body 
simulation. Each sightline wraps around our periodic 
simulation box and extends for 130 co-moving Mpc/h. 
Our 3D calculations are not 'in the light-cone' and so wc 
ignore redshift evolution in the ionization state of the gas 
across each sightline. 

For convenience in performing our radiative transfer 
calculation, and generating mock quasar spectra, this 
interpolation is performed using a fine grid with 1500 
cells. Moreover, we interpolate the HI photoioniza- 
tion rate from our 3D calculations onto each line of 
sight. This specifies a (fluctuating) galactic ionizing 
background which is incorporated as we calculate the 
subsequent ionization from our quasar source. We ap- 
proximate the galactic background ionization as fixed 
over the quasar lifetime, t q ~ t s ~ 4 x 10 7 yrs. From 
the simulated HI photoionization rate, we specify a fluc- 
tuating background Hcl photoionization rate (our 3D cal- 
culations ignore helium and hence do not track Hel pho- 
toionization), assuming a galactic ionizing spectrum that 
follows a v power-law near the HI/Hel photoionization 
edges. We further assume that our galactic sources do 
not contribute any photons capable of doubly ionizing 
helium. Finally, we assume that cells ionized by galaxies 
in our 3D calculations are uniformly heated to T = 10 4 K 
(gas temperature is not tracked in our 3D calculations). 
Neutral cells are assumed to be initially at the CMB tem- 
perature, but our results would be unchanged for any 
other small temperature. 

Our ID radiative transfer calculations follow the C2- 
ray scheme of Mellema et al. (2006a), generalized to 
include helium and to track the thermal state of the gas 
in the IGM. We refer the reader to the Mellema et al. 
(2006a) paper for a description, and we will present de- 
tails of our implementation elsewhere, but a few perti- 
nent comments are as follows. We make use of the fact 
that, along the line of sight in the frame of the observer, 
the quasar front obeys the non-relativistic I-front equa- 
tion, and so there is no need to impose finite speed of light 
constraints (see Cen & Haiman 2000, Bolton & Haehnelt 
2006 and references therein). In order to rapidly com- 
pute photoionization and photoheating rates, we adopt 
approximate forms for the frequency dependence of the 
opacity between the Hel and Hell edges and above the 
Hell photoionization edge. This allows us to compute 
photoionization and photoheating rates from lookup ta- 
bles after specifying just the optical depth in each of 
three frequency bins (see Shapiro et al. 2004 for a simi- 
lar scheme) . We use the thermal and chemical rates from 
Hui & Gnedin (1997). Our code has been tested against 
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Fig. 6. — Photoionization rate of HI and Hell from our ID ra- 
diative transfer calculations as a function of separation (r) from 
the quasar. The left panels show the photoionization rates along 
two random lines of sight through our simulation box, for a case 
in which the quasar turns on when 48% of the simulation volume 
is ionized. The red solid line shows the initial photoionization rate 
contributed by galactic sources, before the quasar itself turns on. 
The black dashed line shows the photoionization rate after the 
quasar has shined for t q ~ t s , reflecting the combined radiation 
field of the quasar and background galaxies. The blue dashed line 
shows the expected location of a quasar HII front in an IGM with 
a uniform ionization level of 48%. The top line of sight intersects 
many neutral patches, and the quasar front halts around the ex- 
pected position in a uniform medium. In the bottom panel, the 
front passes through longer stretches of ionized gas and extends 
further. The green dotted line in the bottom panel shows the 
relatively broad quasar Hell front (for visual clarity we omit the 
Hell curve in the other panels). The right panels are identical for 
sightlines in which 70% of the volume is ionized when the quasar 
turns on. Since quasars are born into large HII regions, the typical 
sightline extends further than expected in a uniform medium. 

the usual analytic solutions for I-lront propagation in a 
homogeneous IGM. 

We show results from our ID calculations in Figure [5] 
for our fiducial model with initial (pre-quasar) volume- 
averaged ionization fractions of (x;) = 0.48 and 0.70. 
Consider first the combined HI photoionization rate from 
galaxies and quasars (indicated by the black dashed lines 
in Figure [5]): at small separations, the quasar strongly 
dominates the photoionization rate and there is a rel- 
atively smooth ~ 1/r 2 fall-off, while the HI photoion- 
ization rate fluctuates significantly at larger distances 
from the quasar. The extent of the ~ 1/r 2 run indi- 
cates the maximum distance the quasar front reaches 
over the quasar lifetime along a given sightline: the front 
halts at some distance upon crossing sufficient quanti- 
ties of neutral gas, and subsequently there is only pho- 
toionization from background galaxies. The photoioniza- 
tion from background galaxies itself fluctuates owing to 
the presence of neutral stretches of gas, since there is a 
distribution of bubble sizes at each stage of reionization 
and the photoionization rate is typically more intense in 



Fig. 7. — Cumulative probability distribution of quasar ionization 
front extents for partly neutral models. Top panel: The cumulative 
probability distribution of quasar ionization front extents in models 
with initial volume-weighted ionization fractions of (xi) = 0.48 
and (xi) = 0.7. Here we assume a quasar luminosity of N = 
1.8 X lO 57 ^" 1 and a quasar lifetime of t q ~ t s . The red dashed line 
indicates the expected quasar front position in a uniform IGM with 
(xi) = 0.48 and t q ~ t s . Bottom panel: The same for a quasar 
lifetime of t q ~ t s /10. In each case, the histograms are quite 
broad owing to the presence of large ionized bubbles created by 
galaxies before quasar turn-on. The fronts generally extend further 
than in a homogeneous IGM of the same ionization fraction, since 
the quasars tend to be born into large HII regions. Occasionally, 
fronts passing through mostly neutral gas extend slightly less far 
than in the uniform case owing partly to overdense gas and helium 
absorption. The scalings for quasar front extent expected in a 

uniform IGM, r p oc t q ^/ < xjji > 1//3 , are inaccurate owing to the 
presence of large galaxy-created HII regions. 

larger ionized bubbles, and because the photoionization 
rate fluctuates from place to place within ionized bub- 
bles. 

The red solid lines in Figure [S] indicate the initial ion- 
ization provided by galaxies before the quasar itself turns 
on. From the example sightlines, it can be seen that 
quasar ionization fronts extend further along sightlines 
with larger levels of 'pre-ionization' from background 
galaxies. For example, the quasar front traverses more 
neutral material, and extends less far, in the top-left 
panel than in the bottom left panel, even though both 
adopt the same quasar lifetime and volume- weighted ion- 
ization fraction. Another interesting feature of Figure [6] 
is that the opacity to ionizing photons at the Lyman- 
limit frequency from residual neutral material amounts 
to t ~ 1 at r ~ 50 co-moving Mpc/fo. Hence, there 
is some fall-off in excess of ~ 1/r 2 for lines of sight in 
which the front extends to large distance, although this 
is slightly disguised in our figure by upward fluctuations 
in the galactic photoionization rate. In £|5.4l we argue that 
our simulations may underestimate this residual opacity. 

The example sightlines of Figure O illustrate a diver- 
sity of quasar front positions at a given with a 
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strong dependence on the initial galaxy-generated ion- 
ization along each sightline, but we require a statistical 
measure to characterize this quantitatively. In Figure [7| 
we show the cumulative pdf of ionization front extent, av- 
eraged over our ensemble of 100 sightlines for two partly 
neutral models and two assumed quasar lifetimes. Here, 
and in what follows, we show results in units of proper 
Mpc to match the convention of previous proximity-zone 
modelers. The vertical dashed lines in each panel illus- 
trate the expected front position in a uniform IGM with 
(xi) = 0.48. In each case, we see the bias discussed in the 
previous section: since quasars are born into large HII re- 
gions, the fronts extend further on average compared to 
an IGM that is uniformly ionized (with an ionization frac- 
tion equal to the volume-averaged ionization fraction in 
the patchy reionization case). On the other hand, there 
are also occasional sightlines that reach less far than ex- 
pected in a uniform IGM. These sightlines traverse longer 
neutral stretches than the typical sightline. The presence 
of overdense gas, and absorption by helium, also act to 
reduce the front extent compared to the expectation for 
a uniform IGM. For the majority of sightlines, however, 
the fronts travel further than expected in a uniform IGM. 
The average front extends ~ 25% (~ 30%) further in our 
patchy reionized model at (xj) — 0.48 ((x,) = 0.70) - 
assuming t q *~ t s — than in a uniform IGM. 

Moreover, the sightline to sightline scatter in quasar 
front position is quite large. For example, it takes until 
r p ~ 12.9 (18.1) proper Mpc to get to the 84% (97.5%) 
upper limit on quasar front extent when (xj) = 0.48 
and t s ~ t q . Clearly quasar ionization fronts can ex- 
tend much further than in a uniform IGM, where one 
expects r p ~ 7.7 Mpc for (x,) = 0.48, t s ~ t q . The bias 
and scatter are still stronger when the quasar lifetime is 
shorter, as illustrated in the bottom panel of Figure \7\ 
This occurs because when the quasar lifetime is short the 
quasar has less time to ionize the surrounding gas and the 
front extent along an individual sightline is hence even 
more sensitive to its initial pre-quasar ionization state. 
The histograms also illustrate that the scalings expected 
in a uniform IGM, r p oc tq /(xhi) 1 / 3 , are inaccurate: 
the extent of the quasar ionization fronts is sensitive to 
the presence of galaxy-generated ionized pathways. This 
means that one needs a significant data sample to in- 
fer the ionized fraction statistically, as well as a model 
to connect the probability distribution of proximity zone 
sizes with the volume-weighted ionization fraction. 

It is also interesting to compare the ionization front 
extents in these models with the 'proximity' scale. This 
scale is relevant for models in which the entire volume of 
the IGM is ionized and indicates where the photoioniza- 
tion rate from the quasar drops to that of the background 
radiation field. Ignoring residual opacity along the line 
of sight (this may overestimate the proximity scale - see 
£15.41 for comments), this distance is r prox ~ 18 proper 

1/2 

Mpc x (n/1.8 x lO 57 *" 1 ) (rbeknd/lO- 13 *- 1 )" 172 . 

Notice that this scale is comparable to the extent of 
quasar ionization fronts along many of the sightlines in 
our partly neutral models, suggesting that it may be dif- 
ficult to distinguish models on the basis of their Ly-a 
forest proximity zones. Another way to say this is as 
follows: for our fiducial parameters, even in partly neu- 



tral models there is generally little contrast between the 
photoionization rate at the edge of the quasar ionization 
front and that provided by the background galaxies be- 
yond the front (see Figure [6]). As we will see in the next 
section, this makes it challenging to locate the 'edge' of 
the quasar front on the basis of Ly-a absorption spec- 
tra. The contrast between the photoionization rate at 
the front edge and from the background galaxies will 
be larger than in our fiducial model if (Phi) is smaller 
than we assume (background galaxies are less intense), 
or if the quasar radiation field is more intense at the 
front edge. The quasar will be more intense at the front 
edge if the quasar lifetime is shorter (although even for a 
short lifetime quasar fronts will extend to large distances 
along some sightlines, Figure [7]), and more luminous. In 
practice, however, the brightest quasars in the Fan et al. 
(2006) sample are only a factor of ~ 2 brighter than in 
our fiducial calculation. 

Before constructing mock quasar spectra from our ID 
calculations, we briefly discuss helium photoionization 
rates and the thermal state of the IGM (see also Bolton 
& Haehnclt 2006). The galaxies that reionize the IGM in 
our simulations have a soft spectrum and so they mostly 
singly ionize helium, but do not doubly ionize it. Hence, 
before the quasar turns on in our calculations, helium 
is mostly singly ionized and is doubly ionized only by 
the quasar itself. Because of this, the Helll front lags 
behind the HII and Hell fronts. The Helll front is, how- 
ever, rather broad owing to the long mean free paths for 
Hell ionizing photons. Both of these features are illus- 
trated by the green dotted line in the bottom left-hand 
panel of Figure [6l The Hell front (not shown in the fig- 
ure for visual clarity), on the other hand, closely tracks 
the HII front in our calculation. In principle, the posi- 
tion of the quasar Helll front provides interesting infor- 
mation regarding the ionization state of helium and the 
quasar lifetime: we can be fairly confident that HcIII, 
unlike HII, comes from photoionization by the quasar it- 
self. However, Hell Lya forest observations must be done 
in the ultraviolet band and are hence challenging (e.g. 
Shull 2004). Finally, initially neutral regions are raised 
to temperatures of 30 — 40, 000 K by the photohcating 
of HI and Hell (see Abel & Haehnelt 1999, Bolton & 
Haehnelt 2006). This quasar- heated gas recombines less 
rapidly, and is more thermally-broadened than the cooler 
galaxy-ionized gas (which we assume to be initially uni- 
formly heated to 10, 000 K.) 

5. MOCK QUASAR SPECTRA 

At this point, the reader may be curious about two as- 
pects of our line of reasoning. The first point is that, 
starting with Wyithe & Loeb (2004), several authors 
have pointed out that the observed z > 6 proximity zones 
are small and used this fact to argue for a partly neutral 
IGM. In the previous sections, we argued that quasar 
ionization fronts may extend further than assumed by 
these authors in a partly neutral IGM. One might imag- 
ine that this strengthens the case for a partly neutral 
IGM, or even places our results in conflict with observa- 
tions. Second, at least some of the sightlines in Figure [H] 
have sharp ionization fronts and one might assume that 
it would be relatively easy to detect these sharp features 
in quasar absorption spectra. 

In this section we will address both of these points. 
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First, we illustrate that in our fiducial model, the quasar 
is dim enough at the ionization front edge that the aver- 
age transmission through the forest is already very low 
( £|5.1[) . Generally this means that previous measurements 
of quasar ionization front extents are underestimates, as 
pointed out previously by Bolton & Haehnelt (2006) and 
Maselli et al. (2006). Mesinger & Haiman (2004) also 
comment on the difficulty of recovering quasar front po- 
sitions from Lya forest spectra. On the other hand, they 
argue that the Ly/3 region can be used in conjunction 
with the Lya forest to aid in recovering the position 
of ionization fronts, although here one needs to contend 
with the foreground Lya forest. Here, we describe and 
test (on mock spectra) an improved algorithm for deter- 
mining the extent of quasar ionization fronts from Lya 
forest spectra f H5.2p . We then emphasize the strong de- 
pendence of our results on the (highly uncertain) level 
of small-scale structure in the IGM, and other unknowns 

5.1. Average Transmission in the Quasar Proximity 

Zone 

We construct mock quasar spectra from the neutral hy- 
drogen density, temperature, and peculiar velocity fields 
generated from our cosmological simulation, and ID ra- 
diative transfer calculations, in the usual manner (see 
e.g. Hernquist et al. 1996, Hui et al. 1997). Note that 
we calculate the optical depth by convolving with the full 
line profile, including the effects of both thermal broad- 
ening and the natural line width, since damping-wing 
absorption can be significant in our partly neutral sce- 
narios (Miralda-Escude 1998). Since the damping wings 
are quite broad, we extend our sightlines slightly (ignor- 
ing redshift evolution in the ionization state) in order to 
accurately calculate the optical depth near the edge of 
each sightline. 

First, let us consider the average transmission profile 
around several z q — 6.42 quasars hosts drawn from our 
simulation. Specifically, we calculate the transmission 
using the ID radiative transfer calculations of the previ- 
ous section along each of 100 lines of sight, which were 
assembled from 20 random lines of sight towards each 
of the five most massive halos in our simulation box. 
Clearly this is for illustrative purposes only, as present 
data samples consist of only 3-4 quasars with z > 6.2 
(Fan et al. 2006). 3 

To begin with, we consider our patchy reionization 
model with (x^ — 0.7, and restrict our analysis to the 
blue side of the quasar Lya emission line. The 100- 
sightline averaged transmission profile for this model is 
indicated by the solid black line in the top panel of Fig- 
ure [H The sightline averaged transmission falls rapidly 
with increasing distance from the quasar, a direct reflec- 
tion of the decreasing quasar ionizing flux, which falls off 
as FHi.qso oc 1/r 2 . Indeed, at ~ 6 Mpc proper from the 
quasar the average transmission is only at the 10% level. 
At still larger separations from the quasar there is only 
a small residual flux, on average. The transmission at 
large separations results partly from sightlines that in- 
tersect galaxy-created HII regions beyond the extent of 

3 One of the 4 existing z > 6.2 quasars is a broad absorption 
line quasar, and has not been used in studies of the IGM (Fan et 
al. 2006, Mesinger & Haiman 2006). 
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Fig. 8. — Mock quasar spectra drawn from each of two partly 
neutral models and a completely ionized IGM. Bottom panel: The 
black solid line shows the transmitted flux as a function of distance 
from the quasar, averaged over a large ensemble of quasar spectra 
for a patchy-reionization model with (xi) = 0.7. The red, green, 
and blue histograms show the transmission along several example 
sightlines from the patchy model, smoothed as in Fan et al. (2006). 
The cyan dashed histogram shows an example sightline from a 
completely ionized IGM {(xi) = 1) with the same mean hydrogen 
photoionization rate as our patchy models. The red dashed line 
shows the expected ionization front extent in a uniform model with 
(xi) = 0.7. In a ll ca ses, there is transmitted flux beyond this 
scale, although in i|5,3l we show that this depends on the amount of 
small-scale structure in the IGM. This transmission occurs because 
quasar fronts tend to extend further in the patchy case than in the 
uniform, low-level ionization model, and because galaxy-created 
HII regions beyond the extent of the quasar front can transmit 
flux. It is hard to distinguish between the highly-ionized and partly 
neutral models. Top panel: Identical to the bottom panel for a 
model with (xi) = 0.48. 

the quasar HII front, and partly from sightlines where the 
quasar HII front itself reaches large distances by passing 
through several galaxy-generated HII regions. The trans- 
mission extends significantly further than quasar fronts 
do in a uniform IGM of the same ionization fraction, as 
illustrated by the vertical red dashed line in the figure 
(see ^5.31 for caveats). These features just illustrate the 
impact of pre-quasar HII regions and the extended and 
fluctuating photoionizing flux of Figure [6] on the trans- 
mission in the Lya forest. 

The top panel of Figure [5] is very similar to the bot- 
tom one, except that in this panel the mock spectra are 
drawn from a model where reionization is less advanced 
when the quasar turns on ((xi) — 0.48). Comparing the 
solid line in the two panels, one sees only very subtle 
differences in the average transmission between the two 
models. 

Note that in our fiducial model, we do typically get 
some transmission through the Lya forest beyond the 
quasar range of influence even when the IGM is 50% 
neutral (Furlanetto et al. 2004b). This transmission 
may be overestimated if the IGM is more clumpy than 
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in our model, as we argue subsequently, or if we over- 
estimate the galactic photoionizing background. In or- 
der to get transmission through the IGM when it is half 
neutral, there must also be sufficiently long stretches 
of ionized gas so that the damping wings from neutral 
gas on each side of an ionized stretch do not overlap 
(Miralda-Escude 1998). If we approximate the IGM as 
homogeneous on each side of an ionized stretch, with 
neutral fraction Xjji, the damping wing opacity (includ- 
ing both redward and blueward of the ionized stretch) 
amounts to Td w = 1 at the center of the ionized re- 
gion of size L ~ 4:XHi^aR a nH(z){c/ H(z)) 2 /n (Miralda- 
Escude 1998). Here, a a is the Lya cross-section, R a = 
2.02 x 10~ 8 is related to the natural line width, and the 
other symbols have their usual meanings. When the IGM 
is 50% neutral, this estimate - which should be an overes- 
timate close to the quasar - requires a stretch of L ~ 1.8 
proper Mpc. Paths of this length are not rare in our cal- 
culations when the IGM is 50% neutral. Hence, resonant 
absorption may be more of an obstacle for transmission 
in the partly neutral IGM, than damping wing absorp- 
tion (Furlanetto et al. 2004b). 

Let us now consider transmission spectra on a 
sightline-by-sightline basis. This is more observationally 
relevant than the 100 sightline-averaged curve described 
above, given the limited observational samples available 
presently. We follow Fan et al. (2006) and top-hat filter 
our mock spectra at 20 A resolution, even though their 
spectra have intrinsically higher resolution. This extra 
smoothing is performed because as emphasized by Fan et 
al. (2006), the proximity scale, defined as the first time 
the transmission crosses some threshold value, is natu- 
rally sensitive to the smoothing scale of the observations. 
Example lines of sight are shown by the solid, colored 
histograms in Figure [H For comparison, we also show a 
sample line of sight (cyan dashed histogram) drawn from 
a model in which the entire volume of the IGM is ionized. 
One can immediately see that it is difficult to distinguish 
between different models for the volume-ionized fraction 
on the basis of a few absorption spectra alone. The main 
obstacle is that the transmission is generally very low 
close to the edge of an ionization front in our models. 
Comparing with the highest redshift bin in Figure 14 of 
Fan et al. (2006) it seems, however, that each of our 
models overproduces the transmission at large distances 
in comparison to the observations. We return to this 
shortly in ^5.31 

First, in order to further examine our ability to dis- 
criminate between models, we construct histograms of 
first-threshold crossing distances from our smoothed 
spectra. This has been used by Fan et al. (2006) and oth- 
ers as a diagnostic for ionization-front position. Here we 
simply record the closest distance to the quasar at which 
the smoothed-transmission falls below a given threshold, 
^thresh- Using Fthrcsh = 0.1 as in Fan et al. (2006), we 
find little difference between models. This is because the 
IGM is opaque enough at these redshifts that the trans- 
mission generally falls below this threshold even within 
the quasar ionization front, making this measure a poor 
tracer of ionization front extent. Using a smaller thresh- 
old is an improvement, but it is still very difficult to dis- 
tinguish between models since even the smoothed spec- 
trum frequently falls below the threshold before reaching 



the front edge. If one chooses a still smaller threshold, 
then one picks up transmission from background galax- 
ies and misidentifies the front position. Smoothing each 
spectrum with a 20 A filter also partly washes out dis- 
tinctions between models, since the transmission spikes 
are generally very narrow. 

5.2. Improved Algorithm for Determining Quasar-Front 

Extents 

Can we devise a better algorithm to locate the extent 
of quasar ionization fronts and distinguish between dif- 
ferent models? Our basic task is to identify a scheme 
for locating an 'edge' in a quasar absorption spectrum: 
shortward of the edge, the quasar contributes to the ion- 
izing flux, while longward of the edge only background 
galaxies contribute. We propose a simple variant of the 
Fan et al. (2006) first threshold-crossing scheme, based 
on locating the furthest distance from a quasar, beyond 
which the (un-smoothed) transmission crosses a given 
threshold. In locating the 'last-crossing' distance, we 
consider only pixels within 20 proper Mpc of the quasar. 
We find that this 'last-crossing' scheme is a better diag- 
nostic of quasar front position. It is also advantageous to 
use the full un-smoothed spectrum to preserve as much 
information as possible, although some smoothing will 
be necessary for sufficiently noisy data. 

The basic idea of our last-crossing threshold scheme is 
illustrated by the example sightlines shown in Figure [HI 
A clean case is shown by the sightline in the left-hand 
panels, drawn from our fiducial model with (xt) = 0.48. 
Here, the transmission fluctuates and is reasonably sub- 
stantial before declining beyond r ~ 8 — 9 proper Mpc. 
In this case, the last place along the line of sight where 
the transmission crosses a threshold of Fthresh = 0.1 is 
a reasonable indicator of quasar ionization front extent, 
as can be seen in the figure. Along this sightline, it is a 
slight underestimate (by ~ 10%), owing both to damping 
wing absorption and since one requires a sufficiently un- 
derdense region to allow transmission through the forest. 
(Note that peculiar velocities also impact the mapping 
between observed and actual position.) Note that there 
is also a substantial galactic component to the ionizing 
flux at larger distances along this sightline. In this case, 
the surrounding gas is, however, too dense to transmit 
flux through the Lya forest. 

Along other sightlines, it can be quite difficult to locate 
the position of the quasar ionization front, and even our 
improved algorithm can lead to inaccurate results. This 
is illustrated by the example sightline in the right-hand 
panels of Figure 03 Along this line of sight, the quasar 
ionization front extends to a larger distance (r ~ 15 
proper Mpc), but fails to produce a transmission spike 
- i.e., transmission exceeding our flux threshold - be- 
yond r ~ 7 — 8 proper Mpc. This is again the difficulty 
mentioned in the previous sub-section: the transmission 
at the quasar ionization front edge is generally low, and 
there is no guarantee of surrounding gas that is under- 
dense enough to transmit any flux. If we lower the flux 
threshold slightly, our algorithm will pick up the more 
distant quasar-created transmission spike. However, if 
we lower the threshold too much, our algorithm will be- 
gin to identify the still more distant galaxy-generated 
spikes. 

In order to characterize the accuracy of our algorithm 
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Fig. 9. — Illustrative examples of our front-locating algorithm. 
Here we show an example sightline where our method performs 
well, and an example where it performs poorly. Each sightline is 
drawn from our model with (x^ = 0.48. Left-hand panels: Well- 
recovered front position example. The ionization field (bottom 
panel), and (un-smoothed) transmission through the Lya forest 
(top panel) for a sightline where the edge of the quasar ionization 
front is recovered relatively accurately. As in Figure [6] the red 
solid line shows the pre-quasar photoionizing field, while the black 
dashed line shows the combined quasar and galaxy field. The green 
dashed line in the top panel indicates an example flux threshold 
used in recovering the front position. The last place where the 
transmission crosses the flux threshold is close to the true front 
position, which corresponds to where the photoionization rate first 
drops off in the bottom panel (near ~ 8 — 9 proper Mpc). Along 
this sightline, in spite of the significant galactic photoionizing back- 
ground, there is no gas that is sufficiently underdense to transmit 
flux at greater separations from the quasar. Right-hand panels: 
Poor-recovery example. In this case, the quasar front extends to 
a considerably larger distance (r ~ 15 proper Mpc). The IGM, 
however, is opaque enough to prevent transmission in excess of the 
flux threshold beyond r ~ 8 proper Mpc. Consequently, along 
this sightline the true front position is underestimated by nearly a 
factor of ~ 2. 

more quantitatively, we locate the last threshold crossing 
position for each of 100 sightlines drawn from our fidu- 
cial model at (xi) — 0.48 and compare this with the true 
quasar front position. The results of this test are shown 
in the bottom panel of Figure [TDJ Concentrating on the 
locus of points near r < 8 proper Mpc, we see that our 
front recovery algorithm works reasonably well for small 
quasar front extents along many sightlines. In this case, 
the true front position is underestimated: this owes to 
damping wing absorption, and since one needs to pass 
through an underdense region to transmit any flux. The 
recovered front position hence, loosely speaking, indi- 
cates the last underdense region within the quasar front. 
The error in the recovered front position can, however, 
be large as one can see from the many points lying above 
the dashed line in the figure that indicates perfect front 
recovery. These points correspond to sightlines where 
the galactic ionizing background beyond the quasar front 
gives rise to transmission through the Lya forest. Along 



Fig. 10. — Test of quasar front position recovery, and power to 
distinguish between models. Bottom panel: Scatter plot comparing 
the true quasar front position (x-axis) along a sightline, with the 
front position recovered using our last-crossing algorithm (y-axis). 
We show the recovered front positions for each of three flux thresh- 
olds. The dashed line indicates perfect recovery: points underneath 
the line indicate sightlines where our algorithm underestimates the 
true front position, while points above the line indicate overesti- 
mates. Top panel: The cumulative pdf of last-crossing positions 
(with -Fthrcsh = 0.1) for 100 sightlines drawn from models with 
each of (xi) = 0.48,0.70 and 1 and all other parameters fixed at 
their fiducial values. This illustrates that even if the error on the re- 
covered front position along an individual sightline is large, we can 
still hope to distinguish between models given a sufficiently large 
sample of sightlines. However, the details of this panel are subject 
to several model uncertain ties, particularly the level of small scale 
structure in the IGM (see £|5.3J l, 

these sightlines, the front-locating algorithm is fooled 
into thinking that galaxy-created transmission spikes are 
associated with flux from the quasar, resulting in an over- 
estimate of quasar front position. This source of error is 
reduced by increasing the flux threshold - and hence re- 
ducing the occurrence of 'false positives' - but increasing 
the flux threshold in turn causes one to further underes- 
timate the short-length quasar front positions. Note also 
that quasar ionization fronts really do extend to large 
distances along some sightlines, so it is not viable to 
simply throw sightlines with large front extents out of 
the analysis. Doing so could easily bias our constraints. 
Quantitatively, the true front position is recovered most 
accurately on average for F t hresh = 0.05,0.1, in which 
case the true front position is identified to within ±20% 
along ~ 40% of sightlines for this model. There are how- 
ever significant outliers as one can see clearly from the 
figure. 

Even though the error in front positions from individ- 
ual sightlines may be substantial, one can still hope to 
distinguish between different models statistically, albeit 
in a model-dependent manner. This is illustrated by 
the top panel of Figure [TUJ Here we show the cumu- 
lative pdf of recovered front positions for -Fthrcsh = 0.1 
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at (xi) = 0.48,0.70, and 1, with our other model pa- 
rameters fixed at their fiducial values. In this example, 
we take the idealized case of 100 quasar spectra with 
perfect signal-to-noise, resolution and identical quasar 
luminosity and lifetime. One can clearly see the ten- 
dency towards larger last threshold crossing distances as 
a greater fraction of the volume is ionized, as expected. 
In the context of this simplified calculation where, for 
example, all of our quasars have identical lifetimes and 
luminosities, we expect ~ 10 (30) spectra are needed to 
rule out a completely ionized model at ~ 1 — a (2 — a) if 
the real universe has (xj) = 0.48. 4 

As we emphasize in the next subsection, there are im- 
portant model uncertainties that could impact these pre- 
dictions. Specifically, the precise mapping between hy- 
drogen photoionization rate and transmission is sensitive 
to the level of small scale structure in the IGM. If there 
is less transmission at a given hydrogen photoionization 
rate than in our model (as may be indicated by the data) , 
then the distance beyond which it becomes hard to locate 
the true front position (r ~ 8 — 9 proper Mpc in Figure 
HOP will shrink, and make it harder to distinguish mod- 
els. On the other hand - in comparison with our fiducial 
case - increasing the quasar luminosity while decreasing 
the quasar lifetime, and/or decreasing the intensity of 
the photoionizing background should make it easier to 
locate ionization front edges. Finally, following Mesinger 
& Haiman (2004), applying our algorithm to the Ly/3 for- 
est may be helpful, although here one needs to deal with 
the foreground Lya forest. Furthermore, while our last- 
crossing algorithm is easy to implement, it is not 'opti- 
mal' in any sense. One might be able to improve on it by, 
for example, recording the length of dark gaps after last 
threshold crossing along each sightline. Moreover, if one 
can indirectly extract the signature of damping wing ab- 
sorption from the data (Mesinger & Haiman 2004, 2006) 
this would be a smoking gun for a partly neutral IGM, 
although uncertainties in the dumpiness of the IGM may 
complicate this effort fi j5.3p . 

5.3. Sensitivity to Model Details 

At this point, we need to emphasize that the 'map- 
ping' between photoionization rate and transmitted flux 
is model dependent, and may not be well captured by 
our moderate resolution N-body simulations. Specifi- 
cally, our transmission profiles depend on the dumpiness 
of the IGM and the mean photoionization rate within 
galaxy-generated HII regions beyond the quasar range of 
influence. 

We begin by considering the dumpiness in the IGM. 
After reionization, the relevant scale is the filtering scale 
(Gnedin & Hui 1998), the distance over which baryonic 
fluctuations are smoothed owing to finite gas pressure. 
According to Gnedin & Hui (1998), the filtering scale can 
easily be a factor of ~ 10 smaller than the linear Jeans 

4 This significance is estimated by drawing many random realiza- 
tions of N Bpec last crossing distances from the (xi) = 0.48 model, 
which is our assumed 'true' model. For each set of N Bpcc last cross- 
ing distances, we calculate their cumulative probability distribution 
and the Kolmogorov-Smirnov probability that our (x;) = 1 model 
is drawn from the true model. We estimate that N sp0 c spectra are 
required to distinguish the two models at 1 — cr (2 — <r) when 84% 
(97.5%) of realizations have a Kolmogorov-Smirnov probability of 
16% (2.5%). 



scale close to reionization, since the gas takes some time 
to respond to prior photoheating. In our calculations, we 
assume that the baryons directly trace the dark matter, 
but density fluctuations are smoothed out by the finite 
resolution of our simulation. The Nyquist frequency of 
our simulation grid is &Nyq ~ 25/i Mpc . For compar- 
ison, if the gas in the iGM is isothermal with a tem- 
perature of ~ 10 4 i"T at z = 6.5, then the linear Jeans 
wavenumber is kj ~ 15.7h Mpc -1 . Given that the fil- 
tering wavenumber hp should be at least several times - 
and potentially more than an order of magnitude - larger 
than the Jeans wavenumber (Gnedin & Hui 1998), we 
likely underestimate the small scale structure in the gas 
density field. This is a consequence of our finite simu- 
lation resolution (&Nyq ~ 25/i Mpc -1 ) and hence, even 
though we assume that the baryonic density field directly 
traces that of the dark matter, we may underestimate the 
dumpiness in the baryon distribution. 

This could impact the results shown thus far in two 
ways. First, we underestimate the number of recombi- 
nations during reionization and, consequently, we under- 
estimate the residual neutral fraction within previously 
ionized cells. Moreover, quasar ionization fronts should 
extend slightly less far than they do in our calculations 
(see the next sub-section, £j5.4[ for related discussions). 
Second, transmission in the high redshift Ly-a forest is 
very sensitive to the abundance of rare voids in the IGM 
(e.g. Oh & Furlanetto 2005, Becker et al. 2006a). Our 
low simulation resolution artificially suppresses the num- 
ber of low-density excursions in the gas density field. 

In order to explore the impact of the first deficiency 
in our modeling, we now incorporate a simple model for 
sub-grid clumping into our calculations. We first esti- 
mate plausible levels for the dumpiness in the baryon 
distribution close to reionization. (For recent related 
calculations see Kohler et al. 2005, Mellema et al. 
2006b, McQuinn et al. 2006a, Trac & Cen 2006; our 
present work is similar to that of McQuinn et al. 2006a.) 
We estimate a volume-averaged clumping factor for the 
baryon distribution by taking Cy = 1 + (Sf), with 
(<$) ~ / d 3 fc/(27r 3 ) exp(-2k 2 /k|)P dm (k) and we adopt 
the Peacock & Dodds (1996) fitting formula for Pdm(fc)- 
This calculation may be an overestimate for two rea- 
sons. First, the relevant quantity for our reionization 
calculations is the clumping factor of the ionized gas; 
our clumping calculation should exclude the highly over- 
dense interior regions of dense clumps which, in reality, 
self-shield and remain neutral. Second, recombinations 
within the massive halos that host ionizing sources are 
- to some extent - incorporated in the escape fraction 
parameter in our 3D calculations, and should not be 
double-counted. Moreover, we should keep in mind that 
our assumed relation between baryonic and dark matter 
fluctuations, Pbar(fc) = exp(— 2k 2 /k|)Pdm(k), is approx- 
imately correct only according to linear theory, and may 
not be accurate in the fully non- linear regime - especially 
if reionization is not yet complete, and heating is highly 
inhomogeneous . 

We contrast our semi-analytic clumping factor esti- 
mates with the variance of our simulated density field 
smoothed on the scale of the simulation mesh, 1 + 
(£ 2 )mesh = l + (^dm)mc S h = 5.3. (The first equality simply 
results because we assume that baryons precisely trace 
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our simulated dark matter density field.) The precise 
value of the filtering scale is quite sensitive to the details 
of the thermal history of the IGM near reionization. As a 
simple example, consider gas that is completely cool be- 
fore some Zheat, at which point it is uniformly raised to 
a temperature To, subsequently cooling as T oc 1/(1 + z) 
(Gnedin & Hui 1998). In this case, k F = 3.4fcj if 
Zhcat = 10, and kp = 9kj if z hcat = 7.5. If the gas is 
then isothermal at 10 4 K at z ~ 6.5 so that kj = 15.7 h 
Mpc -1 , our clumping factor estimates are Cy = 10.2 and 
Cy = 22.8 for kp = 3.4fcj and kp — 9kj respectively. 
These values are comparable to results from the numeri- 
cal simulations of Mellema et al. (2006b) and Trac & Cen 
(2006), but are considerably higher than the fluctuation 
level on our simulation mesh, 1 + (6%) mcs h = 5.3. 

We incorporate this extra small scale structure in our 
calculations by adopting a sub-grid clumping factor. We 
consider two models for sub-grid clumping. In each case 
we adopt a uniform sub-grid clumping factor, ignoring 
any density dependence and any cell-to-cell scatter. Our 
sub-grid models result in clumping factors of Cy = 10 
and Cy — 20, as motivated above. We include the sub- 
grid clumping models only in our ID radiative transfer 
step, and not in the 3D calculations, since McQuinn et al. 
(2006a) found that similar clumping prescriptions do not 
significantly impact the size distribution of HII regions 
during reionization at a given ionization fraction. Incor- 
porating sub-grid clumping boosts the residual neutral 
hydrogen abundance within previously ionized cells by a 
factor of ~ C su b-grid, and results in a modest reduction 
in the extent of quasar ionization fronts. 

We show mock spectra drawn from our models with 
sub-grid clumping in Figure [TT1 Owing to the increased 
residual neutral fraction within ionized cells, increasing 
the amount of small scale structure significantly reduces 
the transmission through the IGM. The models with sub- 
grid clumping more closely resemble the highest redshift 
SDSS quasar spectra from Fan et al. (2006) than the 
calculations shown in Figure [3 We have also produced 
a model in which the entire volume is uniformly ion- 
ized with (Thi) = 10 s , and Cy = 10 (not shown 
in plot), which has a nearly identical average transmis- 
sion profile to our (x{) — 0.7, Cy — 10 case. There is 
nothing exotic about this model - we expect this level of 
clumping from our semi-analytic calculations even after 
reionization is complete. Our results suggest then that 
the z > 6.2 Lya proximity zone measurements are in fact 
compatible with a highly ionized IGM as previously sug- 
gested by Bolton & Haehnelt (2006) and Maselli et al. 
(2006). 

An important caveat to these results is that here we 
only (crudely) model the enhanced recombination rate 
from sub-grid clumping; we ignore that our limited sim- 
ulation resolution likely underestimates the abundance 
of rare voids in the baryonic density distribution. This 
effect goes in the opposite direction of the recombina- 
tion enhancement discussed here and leads to an under- 
estimate of the transmission. Interestingly, note that 
since the gas distribution is expected to become more 
smoothed-out as time elapses after reionization, this ef- 
fect - while not necessarily the dominant one - tends 
to imply more absorption the earlier reionization occurs. 
We defer examining this to future work. 

Another interesting feature of our calculations is that 
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Fig. 11. — Sensitivity of mock spectra to clumping in the IGM. 
Top panel: The solid lines show the transmission averaged over 
an ensemble of 100 mock spectra in models with volume- averaged 
clumping factors of Cy = 5 (i.e., no sub-grid clumping), 10 and 20 
(from top-to-bottom, black, red, and blue lines respectively). The 
vertical red dashed line is the same as in previous plots. Bottom 
panel: The histograms show a single example spectrum, smoothed 
as in Fan et al. (2005), drawn from each of the above models. The 
mock spectra are more sensitive to the level of dumpiness than 
to the volume- averaged neutral fraction in the IGM (see Figure [8] 
to see the sensitivity to volume- averaged ionization fraction). In- 
deed, a model with Cy = 10 and a uniform ionizing background 
((xi) = 1) with the same mean intensity ((Thi) = lO" 13 ^" 1 ) yields 
a nearly identical ensemble-averaged transmission curve to our cor- 
responding (xi) = 0.7 model. 



the transmission close to the quasar is sensitive to the 
level of dumpiness in the IGM. Sufficiently near the 
quasar, the photoionization rate is dominated by the 
quasar itself, and any galactic component is essentially 
irrelevant (Figure [H]). This means that transmission close 
to the quasar provides a gauge of small-scale structure in 
the IGM that is insensitive to the details of the (highly 
uncertain) ionizing background (Cen & Haiman 2000). 
Interestingly, the transmission profiles near quasars ap- 
pear to evolve rapidly with redshift around z ~ 6.2 (Fig- 
ure 14 of Fan et al. 2006), possibly a reflection of in- 
creased clumpiness in the IGM at high rcdshifts. Since, 
given the quasar luminosity, one knows Thi close to the 
quasar, one might use this region to calibrate the rela- 
tion between photoionization rate and transmitted flux. 
In doing so, one may need to include the impact of quasar 
overdensities and infall, however (Faucher-Giguere et al. 
2007), and account for uncertainties in the quasar ioniz- 
ing luminosity (Telfer et al. 2002). 

As mentioned in £j4) even for a given model of bub- 
ble sizes, there is considerable uncertainty in the precise 
galactic HI photoionization rate within the ionized bub- 
bles (see Equation [6|) . This impacts the mock absorp- 
tion spectra far from the quasar, where photoionization 
from galactic sources is not overwhelmed by that from 
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Fig. 12. — Sensitivity of mock spectra to the volume- averaged, 
'background' hydrogen photoionization rate. The precise value of 
the background hydrogen photoionization rate is affected by sev- 
eral details of our modeling. The histograms show how mock ab- 
sorption spectra vary in our (xi) = 0.7 case with varying Thi), 
assuming no sub-grid clumping (CV = 5). The transmission pro- 
files are impacted at relatively small distances (r ~ 5 proper Mpc) 
in this example, owing to an upward fluctuation in the galactic 
photoionization rate. 



the quasar itself. This is in contrast to the influence of 
sub-grid clumping which, at least in our simplified model, 
impacts the transmission even near the quasar. In order 
to illustrate this, we vary the volume-averaged HI pho- 
toionization rate around our fiducial value, redo our ID 
radiative transfer calculations, and construct new mock 
spectra. The results of this calculation are shown in Fig- 
ure [T2"l 

Indeed, boosting the mean intensity of the photoioniz- 
ing background by a factor of ~ 2 increases the transmis- 
sion far from the quasar noticeably. Note however that 
if the IGM is more clumpy than in our simulation, as we 
expect from our arguments above, a still larger (Thi) may 
be required to achieve this much transmission. On the 
other hand, diminishing the background intensity by just 
a factor of ~ 2 basically erases the transmission 'spikes' 
far from the quasar. Note that this has little to do with 
the proximity zone itself - changing the photoionizing 
background does not impact absorption spectra where 
the quasar itself dominates the photoionizing flux - and 
the effect is maximized far from the quasar where the 
photoionization from galactic sources dominates. This 
just illustrates the usual sensitivity of absorption spectra 
to the mean photoionization rate, except here we adopt 
a more realistic model for the fluctuations in the pho- 
toionizing background. In principle, we can constrain our 
model for the ionizing sources, bubble size distribution, 
and mean-free path to ionizing photons with these mea- 
surements: some of our models may overproduce these 
transmission spikes compared to the data. In detail, this 



will require a better handle on the gas density distribu- 
tion in the IGM, and on our models for sub-grid clump- 
ing. 

Based on our Figures fTTI and [T2l it seems plausible that 
the strong rcdshift evolution in proximity zone sizes ob- 
served by Fan et al. (2006) (their Figure 14) near z ~ 6.2 
reflects evolution in the level of small scale structure in 
the IGM, the abundance of rare voids in the gas density 
distribution, the intensity of the ionizing background, 
and the cosmic mean gas density, rather than redshift 
evolution in the volume-weighted ionization fraction. At 
this point our models are perhaps too crude to warrant 
a detailed comparison. We intend to return to this issue 
in future work. 

5.4. The Impact of Lyman-limit Systems 

In our previous calculations, quasar photons ionize 
gas at very large distances along some lines of sight, 
r > 15 — 20 proper Mpc. Should we really expect quasars 
to impact the ionization state of gas at such large dis- 
tances? The main concern here is that numerical simula- 
tions under-produce Lyman limit systems owing to inad- 
equate resolution and missing physics (Miralda-Escude et 
al. 1996, Katz et al. 1996, Gardner et al. 1997, Meiksin 
& White 2004, McDonald et al. 2005b, Kohler & Gnedin 
2007, Nagamine et al. 2004, 2007). These studies have 
focused on z ~ 3, but this is likely even more of an 
issue at z ~ 6, especially if some 'mini-halos' manage 
to survive photo-evaporative processes. Missing Lyman- 
limit systems may impact our results in several ways: 
first, pre-quasar HII regions will be smaller at a given 
ionization fraction (Furlanetto & Oh 2005, McQuinn et 
al. 2006b); second, quasar ionizing photons will prop- 
agate less far; and finally, these dense absorbers may 
leave their imprint on Ly-a absorption spectra. Note 
that these effects are not well-captured by the dumping- 
factor approach of the previous section, which is more 
appropriate for 'diffuse' gas in the IGM (Furlanetto & 
Oh 2005, McQuinn et al. 2006a). 

Let us begin by asking the question: how large should 
we expect the mean free path for ionizing photons to 
be at z ~ 6? (For closely related calculations, see the 
Appendix of Furlanetto & Oh 2005.) Here, we take an 
empirically-motivated approach and estimate the mean 
free path of quasar ionizing photons from the observed 
column-density distribution in the z ~ 3 Ly-a forest, 
and extrapolate this result to higher redshifts. These 
quantities are very uncertain even at z ~ 3. At z ~ 3 
the essential difficulty is that absorption systems on the 
flat part of the curve of growth, where it is difficult to 
accurately determine column densities, contribute signif- 
icantly to the effective optical depth for ionizing pho- 
tons (see, e.g., Meiksin & Madau 1993). Owing to large 
amounts of saturated absorption in high redshift quasar 
spectra, the column density distribution is still more un- 
certain at high redshifts (z > 5), where Lyman-limit sys- 
tems cannot be directly identified. Moreover, one might 
object that the relevant calculation is the mean free path 
of ionizing photons in the overdense environment of a 
quasar host halo, with gas subjected to the enhanced ion- 
ization field from the quasar, while we will calculate only 
the mean free path derived from column density distri- 
butions measured in typical regions of the IGM. 

These objections are correct in detail, but our ensuing 
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Fig. 13. — Mean free path for hydrogen ionizing photons. The 
curves indicate the proper mean free path of ionizing photons at 
the Lyman-limit frequency as a function of redshift, derived from 
two different models for the column-density distribution in the Ly- 
ot forest, extrapolated to high redshift. The red curve adopts the 
Medium Attenuation model of Meiksin & Madau (1993), while the 
blue dotted curve assumes the model of Haardt & Madau (1996). 
For both models, the mean free path of Lyman limit photons is 
r < 5 proper Mpc by z ~ 6. 

calculations should nonetheless provide useful estimates. 
The mean free path can be calculated from the observed 
column density distribution of Lya absorbers, assuming 
they are Poisson distributed. In this case, the effective 
optical depth to photons at the ionization edge is given 
by (see e.g., Zuo & Phinney 1993): 

T cff (A = 912 A, z min , z max ) = I dz x 

2mm 

J dN m f(N m , z) (1 - e-WHioa) . (7 ) 

The mean free path is then determined by the redshift 
extent, z max — z m j n , over which r e ff = 1. The quan- 
tity /(./Vhi, z) represents the column density distribution. 
The mean free path for higher energy photons is natu- 
rally longer, although it scales less rapidly with frequency 
than in a homogeneous IGM, where one expects A oc v z . 
For example, in the toy case that the column density dis- 

3/2 

tribution follows a power law relation, /(-/Vhi) oc A hi , 
out to arbitrarily large column densities, the mean free 
path scales as A oc ^ 3 / 2 (Zuo & Phinney 1993). Note 
that in Equation ([7]) we include both the residual opac- 
ity from the diffuse, ionized IGM (which is presumably 
incorporated in our calculations), and the opacity from 
Lyman-limit systems, which is poorly-captured in our 
simulations. 

We calculate the mean free path from Equation us- 
ing two different parameterization^ for the observed col- 
umn density distribution. Specifically, we consider the 
Medium Attenuation model of Meiksin & Madau (1993), 



and the model from Haardt & Madau (1996). The differ- 
ences between these models reflect uncertainties in mea- 
surements of the column density distribution. We simply 
assume the redshift scalings in these parameterized mod- 
els, based on observations at z ~ 2 — 4, apply even out 
to z ~ 6. The results of this calculation are shown in 
Figure [T31 

In the Haardt & Madau (1996) model, the mean free 
path for ionizing photons at the Lyman-limit frequency 
is only A ~ 3 proper Mpc at z ~ 6, and in the Meiksin 
& Madau (1993) model the mean free path is only A ~ 5 
proper Mpc by z = 6. Higher energy photons have 
longer mean free paths as described previously. Note 
that Fan et al. (2002, 2006) estimate the mean free path 
at z ~ 6 directly from absorption spectra data. These 
inferences are model dependent, but suggest still shorter 
mean free paths than our extrapolations. Low-column 
density systems and dense absorbers (Nm ^ 10 17 cm~ 2 ) 
provide comparable contributions to the net opacity for 
Lyman-limit photons. For comparison, quasar photons 
at the Lyman-limit in our ID radiative transfer calcula- 
tions have a mean free path of A > 10 proper Mpc (once 
the IGM is ionized - i.e., this mean free path is set by the 
opacity from residual neutral material along a sightline) . 

These calculations suggest that our ID calculations 
overestimate the mean-free path of Lyman limit photons 
by a factor of at least ~ 2. Note that a Lyman-limit 
system with A/hi ~ 10 17 cm -2 , by definition, produces 
an opacity of only r ~ 1 and hence should attenuate im- 
pinging quasar flux, without entirely halting the quasar 
ionization front. Nevertheless, our estimates suggest that 
even if the entire volume of the IGM is ionized, we may 
easily be overestimating the quasar flux at r ~ 10 proper 
Mpc by a factor of e or e 2 . As emphasized previously, 
reducing the quasar flux by even a factor of a few should 
lead to significantly more Ly-a absorption and make it 
harder to distinguish a partly neutral from a fully-ionized 
IGM on the basis of Ly-a absorption spectra. Crossing 
a dense absorber will lead to a Lyman-limit absorption 
system in a Ly-a forest spectrum, but this will be hard 
to distinguish sufficiently far from the quasar where the 
typical level of absorption is large. 

6. CONCLUSIONS 

In this paper we argued that, even if the z ~ 6 
SDSS quasars form when the IGM is partly neutral, 
they are likely born into large HII regions (see also Yu 
& Lu 2005). On spherical average, the ionization frac- 
tion approaches the global mean after averaging over 
R ~ 20 — 30 co-moving Mpc/h. Nonetheless, much 
longer skewers towards quasar host halos pass entirely, 
or predominantly, through ionized bubbles. These ef- 
fects allow quasar fronts to travel further than expected 
in a uniformly-ionized medium of the same ionized frac- 
tion. We find that the presence of large galaxy-created 
HII bubbles in patchy reionization models leads to a 
significant sightline-to-sightline scatter in the extent of 
quasar ionization fronts. This complicates distinguishing 
models on the basis of their observed proximity zones. 
Our results also contradict the expectation for quasar 
fronts propagating into a homogeneous IGM, in which 
case front position scales with neutral fraction simply as 

R s oc X m . 

We emphasized that the resulting quasar absorption 
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spectra are rather sensitive to uncertainties in the level 
of small scale structure in the IGM, which impact the 
mapping between photoionization rate and transmission. 
In accord with several previous studies (Fan et al. 2006, 
Bolton & Haehnelt 2006, Maselli et al. 2006), we find 
that transmission profiles around the z ~ 6 SDSS quasars 
do not generally reveal the extent of quasar ionization 
fronts. The transmission simply falls close to zero when 
the combined radiation from the quasar and background 
galaxies is low enough - essentially, the usual problem 
that small residual neutral fractions lead to complete 
absorption in Lya at high redshift (see also Mesinger 
& Haiman 2004). The precise neutral fraction at which 
complete absorption occurs, and the resulting absorption 
spectra, are sensitive to the level of small scale structure 
in the IGM, the abundance of rare voids in the IGM, the 
mean-free path of ionizing photons in the IGM, and the 
precise photoionization rate within ionized bubbles. 

Consequently, we encourage further theoretical study 
regarding the detailed gas density distribution in the 
IGM (Miralda-Escude et al. 2000) and sub-grid clump- 
ing factor prescriptions (e.g. Kohler et al. 2005, Trac 
& Cen 2006). These models can be constrained by com- 
paring with measurements of the one-point probability 
distribution function in the z > 5 Ly-a forest (Becker 
et al. 2006a), by examining the region in the Lya for- 
est where the photoionization rate is dominated by the 
quasar itself (see £15.31 Cen & Haiman 2000), and through 
other measures. These studies should be very useful for 
improved theoretical modeling of z ~ 6 quasar prox- 
imity zones, and allow more definitive constraints on 
the volume-filling factor of HII regions near z ~ 6. It 
might also be interesting to investigate Ly/3 absorption 
in the proximity zone (Mesinger & Haiman 2004, Bolton 
& Haehnelt 2006), and other methods of identifying or 
constraining the presence of damping wing absorption in 
the observed spectra (Mesinger & Haiman 2006). 

There are a few other additional uncertainties that 
warrant further attention. First, the quasar lifetime is 
very uncertain observationally. Moreover, the 'light bulb' 
model adopted here in which the quasar radiates at a 
fixed luminosity for its entire lifetime is likely too simplis- 
tic (Hopkins et al. 2005, 2006). It might be interesting 
to use the z ~ 6 black-hole growth simulations of Li et 
al. (2006), to provide a more realistic description of high 
redshift quasar life cycles. Observationally, uncertainties 
in quasar redshifts may be significant (e.g. Shen et al. 
2007) and impact proximity zone measurements. More- 
over, uncertainties from continuum-fitting - particularly 
close to the quasar Lya emission line - warrant further 
investigation. 

We mention several possible future research directions. 
First, many of the constraints on the z ~ 6 IGM adopt a 
simple model - where one assumes a homogeneous pho- 
toionizing background, a uniform temperature-density 
relation, and that the baryonic distribution is uniformly 
Jeans-smoothed - which, while remarkably successful at 
z ~ 3 (e.g. McDonald et al. 2005b, Viel & Haehnelt 
2006), should break down close to reionization. Demon- 
strating that this model definitively fails at z ~ 6, and 



identifying which features of the model break, would 
clearly strengthen the case for reionization activity near 
z ~ 6. On the other hand, the detailed large scale reion- 
ization simulations that have been done to date and com- 
pared with quasar absorption spectra (e.g. Kohler et al. 

2005) have focused on only a single model. It is hard 
to determine how well alternate models, in which reion- 
ization is more or less progressed, would match observa- 
tions. Indeed, current measurements appear surprisingly 
consistent with the simple assumption of a spatially uni- 
form ionizing background (Lidz et al. 2006a, Liu et al. 

2006) . It would be interesting to examine whether highly 
patchy reionization might overproduce the sightline-to- 
sightline scatter in the mean absorption. Also, there are 
certainly dark gaps in the spectra of z ~ 5.5 quasars: are 
we sure of the conventional wisdom that the entire vol- 
ume of the IGM is ionized by z ~ 5.5? The current line 
of argument for (a^) = 1 at z *~ 5.5 comes from assuming 
a uniform Thi, and finding that a large photoionization 
rate is required to match the mean transmission in the 
forest - but this model may lead to misleading conclu- 
sions, as one generally expects transmission in the large 
ionized bubbles that may exist before reionization com- 
pletes (Furlanetto et al. 2006b). The abundance of dark 
gaps in z ~ 6 quasar spectra is a particularly interesting 
diagnostic for future modeling (Fan et al. 2002, Lidz et 
al. 2002, Nusser et al. 2002, Gallerani et al. 2006, Fan 
et al. 2006). 

We also intend to consider 21 cm imaging of quasar 
HII regions (Madau et al. 1997, Wyithe et al. 2005b). 
Here, one attempts to detect quasar HII regions based on 
the brightness temperature contrast between ionized gas 
around a known quasar and surrounding neutral gas. It 
would be interesting to investigate whether surrounding 
galaxy-generated HII regions complicate the morphology 
of these holes in the 21 cm sky. 

Although we have focused on subtleties and challenges 
associated with interpreting quasar absorption spectra 
at z ~ 6, we should reiterate that they currently provide 
our most extensive data set regarding the high redshift 
IGM. Moreover, the rapid redshift evolution seen by Fan 
et al. (2006) in most of the statistical properties of the 
z ~ 6 Lya forest is certainly striking, even if the precise 
theoretical implications are unclear. We believe that fu- 
ture modeling, along the lines mentioned above, should 
help us understand the ionization state of the IGM at 
z ~ 6 and establish more definitive predictions for future 
surveys of the high redshift IGM. 
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